Ding Luo 


High-speed surface 
profilometry based on an 
adaptive microscope with 
axial chromatic encoding 


Schriftenreihe Automatische Sichtprüfung und Bildverarbeitung | Band 18 



Editor: Prof. Dr.-Ing. habil. Jürgen Beyerer 


Lehrstuhl für Interaktive Echtzeitsysteme 
am Karlsruher Institut für Technologie 


Fraunhofer-Institut für Optronik, Systemtechnik 
und Bildauswertung IOSB 


Karlsruher Institut für Technologie 
Lehrstuhl für Interaktive Echtzeitsysteme 


High-speed surface profilometry based on an adaptive microscope 
with axial chromatic encoding 


Dissertation approved by the KIT Department of Informatics of the Karlsruhe 
Institute of Technology (KIT) for the academic degree of Doktor-Ingenieur 


by Ding Luo 


Date of oral examination: 18 December 2019 
First reviewer: Prof. Dr.-Ing. habil. Jürgen Beyerer 
Second reviewer: Prof. Dr. rer. nat. Wilhelm Stork 


Impressum 
Karlsruher Institut für Technologie (KIT) 
KIT Scientific Publishing 


Straße am Forum 2 
D-76131 Karlsruhe 


KIT Scientific Publishing is a registered trademark 
of Karlsruhe Institute of Technology. 
Reprint using the book cover is not allowed. 


www.ksp.kit.edu 


This document - excluding the cover, pictures and graphs - is licensed 
under a Creative Commons Attribution-Share Alike 4.0 International License 
(CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/deed.en 


The cover page is licensed under a Creative Commons 
Attribution-No Derivatives 4.0 International License (CC BY-ND 4.0): 
https://creativecommons.org/licenses/by-nd/4.0/deed.en 


Print on Demand 2021 - Printed on FSC-certified paper 


ISSN 1866-5934 
ISBN 978-3-7315-1061-1 
DOI 10.5445/KSP/1000125427 


Abstract 


For the quality assurance of a technical part, the three-dimensional (3D) ge- 
ometric profile of the working surface is often one of the most important as- 
pects, which directly affects the functionality of the part in a fundamental 
way. For example, the roughness of the working surface is typically under 
careful inspection to guarantee specific mechanical properties during its inter- 
action with the environment or the other components. Over the past decades, 
optical 3D surface profilometry has gained an increasing amount of attention 
for such applications in both academic and industrial environments, due to its 
capability of non-contact measurement and high resolution. Various optical 
probes are designed to interact with the target surface, in order to reveal the 
underlying 3D structure. 


With the advent of Industry 4.0, modern “smart factories” are posing 
new challenges to surface profilometry technologies, demanding swift adaptation 
to different inspection tasks with fast measurement speed and high 
accuracy. Such challenges are difficult for conventional optical profilometry 
methods, as they are restricted by the fundamental dilemma between accuracy 
and speed. Technologies such as confocal scanning microscopy are celebrated 
for their superior resolution and accuracy, but suffer from a slow 
measurement speed due to the required mechanical scanning as well 
as the low density of measurement points needed to avoid crosstalk. In contrast, 
methods such as shape from focus (SFF) measure all lateral locations simultaneously, 
which is much more efficient. Nevertheless, the resolution and accuracy of 
the measurement are degraded accordingly. In this thesis, a cascade measurement 
strategy is proposed for optical surface profilometry based on an adaptive 
microscope, which consists of a pre-measurement stage to limit the axial 
measurement range, a main measurement stage, and a post-measurement 
stage for refinement. 


To realize such a strategy, an adaptive microscope with axial chromatic encoding, 
namely the AdaScope, is first designed and developed. Following a holistic 
design approach, the AdaScope consists of two major components. Firstly, 
the programmable light source is based on a supercontinuum laser, whose 
echellogram is spatially filtered by a digital micromirror device (DMD). By 
sending different patterns to the DMD, arbitrary spectra can be generated for 
the output light. Secondly, the programmable array microscope is constructed 
based on a second DMD, which serves as a programmable array of secondary 
light sources. A chromatic objective is utilized so that axial mechanical 
scanning is no longer necessary. The combination of both components grants 
the AdaScope the ability to confocally address any location within the measurement 
volume, which provides the hardware foundation for the cascade 
measurement strategy. 


For the pre-measurement stage, a compressive shape from focus (CSFF) 
method is proposed, where the focal stack is captured in a compressive 
manner. Each frame is a weighted linear combination of all focal planes 
along the optical axis, which improves the efficiency of the capturing process. 
Compared to the conventional SFF method, the image acquisition is 7 times 
faster. 


Two methods are proposed for the main measurement stage. The iterative 
array adaptation method is based on the conventional confocal array scan- 
ning. Multiple iterations of lateral array scanning are performed for a single 
measurement. From iteration to iteration, the array density is increased while 
the axial measurement range is reduced accordingly to avoid crosstalk. Lin- 
ear measurement based on two ramp illumination spectra is proposed for the 
axial scan to efficiently capture information regarding the surface profile. 


The other candidate for the main measurement stage is direct area confocal 
scanning based on a tilted illumination field. It is demonstrated both theoretically 
and experimentally that the confocal signal is largely preserved even 
for a wide-field illumination, as long as the illumination is tilted to a specific 
angle range according to the numerical aperture of the system. This leads to 
a much improved measurement speed with a moderately reduced sensitivity. 


Last but not least, for post-measurement refinement, a dynamic sampling 
approach is developed based on Bayesian experimental design (BED). The 
calculation of the utility function involves numerical integration conducted 
through Monte Carlo sampling, which is computationally expensive. To 
accelerate the process, a recurrent neural network (RNN) is developed and 
trained to approximate the BED process. According to the simulation results, 
this approach achieves a performance between uniform sampling 
and full BED, with a 600-fold speed improvement. 


Kurzfassung 


Für die Qualitätssicherung eines technischen Teils ist das dreidimensionale 
geometrische Profil einer Funktionsoberfläche oft einer der wichtigsten 
Aspekte, welcher die Funktionalität des Teils in grundlegender Weise 
direkt beeinflusst. Beispielsweise wird die Rauheit der Funktionsfläche 
normalerweise sorgfältig geprüft, um bestimmte mechanische Eigenschaf- 
ten während ihrer Wechselwirkung mit der Umgebung oder anderen 
Bauteilen zu gewährleisten. In den letzten Jahrzehnten hat die optische 
3D-Oberflächenprofilometrie aufgrund ihrer Fähigkeit zur berührungslo- 
sen Messung und hohen Auflösung für solche Anwendungen sowohl im 
akademischen als auch im industriellen Umfeld zunehmend an Bedeutung 
gewonnen. Für die Erfassung von Zieloberflächen wurden verschiedene 
optische Sonden entwickelt, um die zugrunde liegende 3D-Struktur zu 
messen. 


Mit dem Aufkommen von Industrie 4.0 stellen moderne intelligente Fabriken 
neue Herausforderungen an die Oberflächenmesstechnik. Sie erfordern eine 
schnelle Anpassung an verschiedene Inspektionsaufgaben mit hoher Mess- 
geschwindigkeit und hoher Genauigkeit. Solche Herausforderungen sind für 
herkömmliche optische Profilometrieverfahren schwierig, da sie durch das 
grundlegende Dilemma zwischen Genauigkeit und Geschwindigkeit begrenzt 
sind. Technologien wie das Konfokalmikroskop sind bekannt für ihre überle- 
gene Auflösung und Genauigkeit, leiden aber unter einer geringen Messge- 
schwindigkeit, da ein mechanisches Scannen sowie eine geringe Messdich- 
te zur Vermeidung von lateralem Übersprechen erforderlich sind. Im Gegen- 
teil, eine Methode wie Shape from Focus misst alle benachbarten Positionen 
gleichzeitig, was wesentlich effizienter ist. Allerdings verschlechtern sich Auflösung 
und Genauigkeit der Messung entsprechend. In dieser Arbeit wird eine 
Kaskadenmessstrategie für die optische Oberflächenprofilometrie vorgeschlagen, 
die auf einem adaptiven Mikroskop basiert und aus drei Messstufen 
besteht: einer Vormessstufe zur Begrenzung des axialen Messbereichs, einer 
Hauptmessstufe und einer Nachmessstufe zur Verfeinerung. 


Um eine solche Strategie umzusetzen, wird zunächst ein adaptives Mikro- 
skop mit axialer chromatischer Codierung entworfen und entwickelt, das 
sogenannte AdaScope. Mit einem ganzheitlichen Designansatz besteht das 
AdaScope aus zwei Hauptkomponenten. Erstens basiert die programmier- 
bare Lichtquelle auf einem Weißlichtlaser, dessen Echellogramm durch 
ein Digital Micromirror Device (DMD) räumlich gefiltert wird. Durch Senden 
verschiedener Muster an den DMD können beliebige Ausgangslichtspektren 
erzeugt werden. Zweitens basiert das programmierbare Array-Mikroskop auf 
einer zweiten DMD, der als programmierbare Anordnung einer sekundären 
Lichtquelle dient. Ein chromatisches Objektiv wird verwendet, um die 
Notwendigkeit einer axialen mechanischen Abtastung zu vermeiden. Die 
Kombination beider Komponenten ermöglicht es dem AdaScope, beliebi- 
ge Stellen innerhalb des Messvolumens konfokal anzusprechen, was die 
Hardware-Grundlage für die Kaskaden-Messstrategie bildet. 


Für die Vormessphase wird eine Compressive Shape from Focus-Methode vor- 
geschlagen, bei der der Fokusstapel auf komprimierende Weise erfasst wird. 
Jeder Frame ist eine gewichtete lineare Kombination aller Fokusebenen ent- 
lang der optischen Achse, was die Effizienz des Erfassungsprozesses verbes- 
sert. Im Vergleich zur herkömmlichen Methode Shape from Focus ist die Bild- 
aufnahme siebenmal schneller. 


Für die Hauptmessstufe werden zwei Methoden vorgeschlagen. Das iterati- 
ve Anordnungsanpassungsverfahren basiert auf herkömmlichem konfokalen 
Abtasten. Für eine einzelne Messung werden mehrere Iterationen des late- 
ralen Array-Scannens durchgeführt. Von Iteration zu Iteration wird die Ar- 
raydichte erhöht, während der axiale Messbereich entsprechend verringert 
wird, um ein laterales Übersprechen zu vermeiden. Für den axialen Scan wird 
eine lineare Messung basierend auf zwei Rampenbeleuchtungsspektren vorgeschlagen, 
um Informationen bezüglich des Oberflächenprofils effizient zu 
erfassen. 


Der andere Kandidat für die Hauptmessstufe ist das direkte konfokale Scan- 
nen basierend auf einem geneigten Beleuchtungsfeld. Sowohl theoretisch als 
auch experimentell wird gezeigt, dass das konfokale Signal auch bei einer 
Hellfeldbeleuchtung weitgehend erhalten bleibt, solange die Beleuchtung ent- 
sprechend der numerischen Apertur des Systems auf einen bestimmten Win- 
kelbereich geneigt wird. Dies führt zu einer deutlich verbesserten Messge- 
schwindigkeit bei moderat reduzierter Empfindlichkeit. 


Zu guter Letzt wird zur Verfeinerung nach der Messung ein dynamischer Ab- 
tastansatz entwickelt, der auf dem Bayesian Experimental Design basiert. Die 
Berechnung der Nutzenfunktion beinhaltet eine numerische Integration, die 
durch Monte-Carlo-Abtastung angenähert wird, was jedoch rechenintensiv 
ist. Um den Prozess zu beschleunigen, wird ein Recurrent Neural Network 
entwickelt und trainiert, um den Bayesian Experimental Design-Prozess zu 
approximieren. Entsprechend dem Simulationsergebnis ist dieser Ansatz in 
der Lage, eine Leistung zwischen einheitlicher Abtastung und vollständigem 
Bayesian Experimental Design mit einer Geschwindigkeitsverbesserung um 
das 600-fache zu erzielen. 


Notation 


This chapter introduces the notation and symbols which are used in this thesis. 


General notation 


Identifiers & Operators    Roman letters 
Scalars                    italic Roman letters, 
                           italic Greek letters 
Vectors                    bold Roman and Greek lowercase letters 
Matrices & Tensors         bold Roman and Greek uppercase letters 
Sets                       blackboard bold Roman uppercase letters 
Distributions              calligraphic Roman uppercase letters 


Symbols 


ou 
(OJ 
Il, 
öz 


transpose of a matrix 

Moore-Penrose pseudoinverse of a matrix 
Euclidean norm 

small axial deviation from the focal plane 
free spectral range 


tilting angle of the illumination field 


On 

dur 

0 = (01,02) 
A 

Àt 


oO 

on 

dp 
@,(A,t) 


©): 


AG) 
incidence angle of light 

diffraction angle of light at a diffraction order of m 
parameters to be estimated from the confocal signal 
wavelength of light 

wavelength to be measured at a certain time 
shortest wavelength in the free spectral range 
longest wavelength in the free spectral range 
parameter of the Gaussian function 


possible design for the next experiment in Bayesian 
experimental design 


optimum design for the next experiment in Bayesian 
experimental design 


parameter of the Gaussian function 

variance of noise 

blazing angle of grating 

temporal spectral flux of the programmable light source 


discrete temporal spectral flux of the programmable light 


source as a matrix 


factors for the variance and bias of the Monte Carlo 
estimator for the utility function 


axial chromatic focal shift 

amplitude of a Gaussian function 

measurement matrix 

bias of the corresponding layer in a neural network 
center of gravity 

a factor to match scaling and/or unit 


component of 3D point spread function 




EC) 
Ea(%asYa-) 
Ep(xp.¥p-A, t) 
ej 


E, Ey 


h(u,v) 

HC) 
Hı(x,y,2,A) 
IC) 

Int‘) 


Kullback-Leibler divergence 

grating period 

physical width of a DMD pixel 

object distance to the lens 

image distance to the lens 

lateral domain of the spectral DMD 

wavelength range of the system 

expectation value 

spectral flux density of the echellogram 

temporal spectral flux density incident on the spatial DMD 
discrete spectral flux of the i-th pixel on the spectral DMD 
discrete spectral flux density of the echellogram as a matrix 


discrete spectral flux density of the echellogram as a 3D 
tensor 


focal length 

activation function of a neural network layer 

3D mask function for the illumination distribution 
1D Laplacian filter 

measurement at a certain wavelength 

measurement at a certain wavelength at a certain time 
expectation of the measurement 

a collection of all current measurements 

amplitude point spread function 

intensity point spread function 

normalized chromatic intensity point spread function 
intensity response to a point object 


integrated intensity response 


In) 
k 

k; 

l 

l; 


A 
Ta(Xa Yat) 
rn (X, Y5»8) 
Bessel function of first kind of order n 
wave number 

LSTM layer in a neural network 

1D index of DMD pixels 


hidden neural network layer to encode the measurement 
location 


diffraction order 


hidden neural network layer to encode the measured 
intensity 


parameter of Monte Carlo sampling for the utility function 
magnification of the optical system 

number of effective spectral DMD pixels 

number of spatial DMD pixels 

array pitch distance in terms of DMD pixels 

number of time steps 

number of axial positions 

parameter of Monte Carlo sampling for the utility function 
Gaussian distribution 

output layer of a neural network 

probability density 


spectral energy distribution of the programmable light 
source 


discrete spectral energy distribution of the programmable 
light source as a vector 


radial coordinate 
temporal reflectivity of the spectral DMD 
temporal reflectivity of the spatial DMD 




Ra(Xa,ya) 


X, y, Z 
Xa» Ya 
Xb» Vb 
Xp, Yp 


Xc Ye 


average reflectivity of the spectral DMD 

discrete average reflectivity of the spectral DMD 
discrete reflectivity of the spectral DMD as a matrix 
discrete reflectivity of the spatial DMD as a matrix 
real numbers 

numerical aperture 

hidden layer of a neural network 

component of 3D point spread function 


area of the homogenized illumination from the 
programmable light source 


time 

integration time of the spectral DMD 
integration time of the spatial DMD 
utility for Bayesian experimental design 
axial optical coordinate 

illumination distribution of the AdaScope 


discrete illumination distribution of the AdaScope as a 
matrix 


radial optical coordinate 
weight of the corresponding layer in a neural network 


a collection of weight matrices for the LSTM layer in a 
neural network 


world coordinates 

spectral DMD coordinates 
spatial DMD coordinates 
spatial DMD pixel coordinates 


camera coordinates 


Xi the i-th component of the signal vector 

x signal vector 

X, reconstructed signal vector 

yi the i-th component of the measurement vector 
y measurement vector 


Acronyms 


2D      two-dimensional 
3D      three-dimensional 

AOL     acousto-optic lens 
AOTF    acousto-optic tunable filter 

BED     Bayesian experimental design 

CAD     Computer-aided Design 
CCD     charge-coupled device 
CCSI    chromatic confocal spectral interferometry 
CLS     confocal line scan 
COG     center of gravity 
CSFF    compressive shape from focus 

DLP     Digital Light Processing 
DMD     digital micromirror device 
DOE     diffractive optical element 

EOL     electro-optic lens 

FoV     field of view 
FWHM    full width at half maximum 

HDMI    High-Definition Multimedia Interface 

IES     German: Lehrstuhl für Interaktive Echtzeitsysteme, 
        Vision and Fusion Laboratory 
IOSB    German: Fraunhofer-Institut für Optronik, 
        Systemtechnik und Bildauswertung, 
        Fraunhofer Institute of Optronics, System 
        Technologies and Image Exploitation 

KIT     Karlsruhe Institute of Technology 
KL      Kullback-Leibler 

LAPD    diagonal Laplacian operator 
LAPM    modified Laplacian operator 
LCoS    liquid crystal on silicon 

MC      Monte Carlo 
MCMC    Markov-Chain Monte Carlo 
MEMS    micro-electro-mechanical systems 

N/A     not applicable 
NA      numerical aperture 
ND      neutral density 

PAM     programmable array microscope 
PCA     principal component analysis 
PDLC    polymer-dispersed liquid crystal 
PSF     point spread function 

RNN     recurrent neural network 

sCMOS   scientific complementary metal-oxide-semiconductor 
SFF     shape from focus 
SFIL    steerable filters algorithm 
SLM     spatial light modulator 
SNR     signal-to-noise ratio 
SVGA    Super Video Graphics Array, equivalent to a 
        resolution of 800 x 600 

TENG    Tenengrad algorithm 

WDM     wavelength division multiplexing 


1 Introduction 


1.1 Motivation 


As a high-tech strategy first launched by the German government around 
2012, Industry 4.0 aims to realize a “smart factory” by combining automation 
and data exchange in manufacturing technologies, where computerization of 
manufacturing is promoted. To achieve a self-optimizing production envi- 
ronment, great demands have been placed upon advanced sensors for quality 
monitoring. Such a flexible production system requires individual inspection 
tasks to be solved within the production cycle. Moreover, due to the desired vast 
individuality of the manufactured products, statistical quality assessment by 
means of random sampling is no longer sufficient. A “smart measurement 
machine” is thus needed to swiftly adapt to different inspection tasks on-site. 


Out of the various properties of a technical part, the three-dimensional ge- 
ometric profile of the working surface is often one of the most important 
aspects for quality assurance, which directly affects the functionality of the 
product in a fundamental way. For example, the roughness of the working surface 
is typically under careful inspection to guarantee specific mechanical properties 
during its interaction with the environment or the other components. As 
another example, Figure 1.1 demonstrates the measurement result of a laser 
welding seam using a confocal line scan (CLS) system. The surface profile of 
the laser welding seam directly reflects the quality of the welding process and 
reveals possible defects inside the welding area, which might lead to malfunction 
or damage. For instance, the area within the red box in the height map shows a 
drop of the seam height. Structural characteristics of a surface, such as steps, 
flatness or curvature, are also common subjects for inspection. 

Figure 1.1: Measurement of a laser welding seam using a Precitec CLS system. A defect of the 
welding seam is marked by the red box in the height map. 


To solve these tasks, the conventional surface profilometer has been applied, 
which consists of a mechanical stylus in contact with the target under in- 
spection. The movement of the stylus is detected and recorded as the target 
is scanned, which reflects its 3D profile. In recent decades, such mechanical 
methods have been widely replaced by optical methods, whose non-contact 
measurement principle and better resolution are 
advantageous in terms of both robustness and applicability. Various kinds 
of optical probes are designed to interact with the target surface, in order to 
reveal the underlying 3D structure. 


One prominent method is confocal microscopy, which was invented by 
Minsky in the late 1950s. Due to its high resolution in both the lateral 
direction and the axial direction, confocal microscopy has attracted much 
attention since the beginning. With a huge amount of research effort invested 
over the years, it has become a powerful tool for a wide range of applications, 
including scientific and industrial inspection of 3D surface profiles. 


Unlike a conventional wide-field microscope, where an area illumination is 
applied, a confocal system such as the one illustrated in Figure 1.2 uses a 
pseudo-point light source, which is typically achieved through filtering a normal 
light source with a small pinhole. Light coming out of the pinhole is focused 
onto the target sample surface, forming a point illumination. When the object 
lies exactly at the position of the focal plane, reflected light from the object 
surface will be able to pass through the second pinhole placed at the conjugate 
position in the detection arm, thus generating a high intensity value in the 
light detector. When the object moves away from the focal plane, the reflected 
light will form a blurred spot on the pinhole in front of the light detector. 
In this case, most of the light will be blocked by the pinhole and therefore 
cannot reach the light detector. By scanning the object axially in a predefined 
range and recording the filtered light intensity simultaneously, an intensity 
peak will arise, whose position indicates the height of the surface point under 
inspection. Through additional two-dimensional (2D) lateral scanning, the 3D 
profile of the target sample can be reconstructed. 


Figure 1.2: Schematic of a point confocal profile measurement system. 

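
The axial scan described above can be illustrated with a minimal sketch. It assumes a Gaussian-like axial response and uses a thresholded center of gravity (COG) as the peak estimator; the function name, the threshold ratio and the synthetic numbers are hypothetical choices for illustration and are not taken from the thesis.

```python
import numpy as np

def confocal_peak_cog(z, intensity, threshold_ratio=0.1):
    """Estimate the axial peak position with a thresholded center of gravity.

    threshold_ratio is an illustrative assumption: samples below this
    fraction of the maximum are ignored so that background does not
    bias the estimate.
    """
    z = np.asarray(z, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = intensity >= threshold_ratio * intensity.max()
    weights = intensity[mask]
    return float(np.sum(z[mask] * weights) / np.sum(weights))

# Synthetic axial scan (assumed model): Gaussian response centered at z0.
z = np.linspace(-50.0, 50.0, 201)   # axial positions in micrometers
z0, sigma = 12.3, 4.0               # hypothetical surface height and peak width
signal = np.exp(-0.5 * ((z - z0) / sigma) ** 2)

height = confocal_peak_cog(z, signal)
print(f"estimated height: {height:.2f} um")
```

Repeating this estimate for every lateral position of a 2D scan would yield a height map of the sample.
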

With the development of digital image sensors and the advancement of computational 
power, various kinds of image processing techniques have been invented 
to retrieve information from the captured images. In a typical camera 
system, when the object lies within the focal plane, the corresponding image 
appears to be sharp with high-frequency spatial components clearly visible. 
However, objects out of focus will appear to be blurred, as if filtered by a low-pass 
filter. Based on this observation, S. K. Nayar [Nay89] first proposed the 
method of shape from focus in 1989. A series of images are taken, in which 
the focal planes of the imaging system are varied axially with respect to the 
target sample. For each lateral position of the object, its corresponding ax- 
ial position can be retrieved by analyzing the sharpness of its adjacent area 
through the image stack. The image with the highest sharpness level leads 
to the focal plane which is the closest to the underlying lateral position. By 
analyzing each lateral position, the complete 3D profile of the surface can be 
reconstructed. 
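
The procedure can be sketched in a few lines. The example below is a simplified illustration, not the implementation used in this thesis: it assumes a modified Laplacian summed over a small window as the sharpness measure, and the helper names (`box_sum`, `modified_laplacian`, `shape_from_focus`) as well as the synthetic focal stack are hypothetical.

```python
import numpy as np

def box_sum(img, radius):
    """Sum over a (2r+1) x (2r+1) neighborhood with edge padding."""
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def modified_laplacian(img):
    """Sum of absolute second differences in x and y."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    c = padded[1:-1, 1:-1]
    lx = np.abs(2 * c - padded[1:-1, :-2] - padded[1:-1, 2:])
    ly = np.abs(2 * c - padded[:-2, 1:-1] - padded[2:, 1:-1])
    return lx + ly

def shape_from_focus(stack, z_positions, radius=2):
    """Pick, per pixel, the axial position of the sharpest frame."""
    sharpness = np.stack([box_sum(modified_laplacian(f), radius) for f in stack])
    best = np.argmax(sharpness, axis=0)
    return np.asarray(z_positions)[best]

# Synthetic focal stack: the left half is in focus in frame 1, the right half in frame 3.
rng = np.random.default_rng(0)
texture = rng.random((32, 32))

def blur(img, n):
    """Apply n passes of a 3 x 3 box filter as a crude defocus model."""
    out = img.copy()
    for _ in range(n):
        out = box_sum(out, 1) / 9.0
    return out

stack = []
for k in range(5):
    frame = np.empty_like(texture)
    frame[:, :16] = blur(texture, abs(k - 1))[:, :16]
    frame[:, 16:] = blur(texture, abs(k - 3))[:, 16:]
    stack.append(frame)

depth = shape_from_focus(stack, z_positions=[0, 10, 20, 30, 40])
```

Because the sharpness is accumulated over a neighborhood, pixels close to the boundary between the two halves are influenced by both regions, which reflects the reduced lateral resolution of this kind of method.
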


Without the necessity of lateral scanning, SFF methods are typically much 
faster than confocal measurement methods. However, due to the dependence 
on the surface texture, SFF methods are less robust compared to confocal 
technologies and can only be utilized on surfaces with a sufficient amount 
of texture. Additionally, computation of the sharpness measure requires the 
consideration of a sizable area adjacent to the inspected location, which ef- 
fectively lowers the lateral resolution of the SFF methods. 


To face the challenges presented by Industry 4.0, new measurement methods 
are urgently required, which can swiftly adapt to various kinds of surface pro- 
file inspection tasks. A holistic design approach must be adopted to account 
for different surface characteristics and scales. Fast measurement data acqui- 
sition should be coupled with advanced data processing algorithms to achieve 
an efficient measurement process. Additionally, the system should be able to 
incorporate prior knowledge regarding the product under inspection, in order 
to further increase the measurement speed. 


1.2 Research Topics 


The work presented in this thesis focuses on the advancement of microscopic 
surface profilometry technologies. The research problems can be categorized 
into the following aspects. 


Optical Scanning with Minimum Mechanical Movement 


The switch from a physical stylus driven by mechanical movement to an optical 
probe represents one of the most important advancements in the development 
of surface profilometry. The robustness and applicability of the measurement 
system are significantly improved by removing the physical contact 
between the measurement system and the target sample. However, macro 
mechanical movement between the probe and the sample is still required in 
most cases. For example, with a commercial chromatic line scan sensor such 
as the CHRocodile CLS by Precitec GmbH, at least one additional linear axis 
is required to achieve a complete three-dimensional measurement of a tar- 
get area. For more complex situations, a two-dimensional positioning table 
and possibly an additional rotation stage are required to perform the measure- 
ment tasks. However, the necessity of mechanical movement presents several 
disadvantages. 


Firstly, mechanical movement lowers the robustness of the complete measure- 
ment system. Due to the physical contacts between the components, mechan- 
ical movement systems are intrinsically more vulnerable to wear and malfunc- 
tion. More maintenance effort has to be invested to ensure full functionality. 


Secondly, synchronization between the measurement system and the mechan- 
ical movement further complicates system configuration. This is particularly 
critical for high precision microscopic measurement, since the accuracy of the 
measurement is directly limited by the accuracy of the mechanical movement. 


Thirdly, the dependency on mechanical scanning restricts the adaptability of 
the measurement system to various kinds of tasks. The mechanical scanning 
system is typically designed and implemented for a particular measurement 
task. Firm connections are applied as much as possible to assure movement 
accuracy. For optimum results, the complete scanning system has to be redesigned 
and reassembled when new tasks arise. 


With the advancement of mechanical scanning devices in the past years, some 
of the aforementioned problems can be largely mitigated through the application 
of state-of-the-art hardware. For example, mechanical wear can be significantly 
reduced through the utilization of air bearings. However, such hardware 
requires a significant amount of investment, which adds to the final cost 
of the measurement system. Meanwhile, other problems remain unsolved as 
long as mechanical scanning is adopted. 


Therefore, this work aims to adopt a full optical scanning approach through 
the design and development of a novel measurement system. 


Efficient Optical Information Acquisition 


The development of modern computers has in many ways exceeded the 
improvement of electronic optical detectors in recent years. Although high 
speed cameras are frequently launched by manufacturers, their development 
speed is generally much slower compared to personal computers. Even 
with state-of-the-art high-speed cameras, the transfer of the image data 
poses new challenges on the bandwidth of communication, which limits 
the speed of measurement. A faster measurement system demands more 
efficient information acquisition and retrieval methods. In the era of digital 
imaging, this is equivalent to reducing the number of images needed for a 
certain measurement task. This requires a thorough investigation into the 
fundamental problems of the existing methods. For 3D surface measurement, 
this consists of two aspects. 


In the lateral directions perpendicular to the optical axis, the most efficient 
way to accelerate the measurement is to place a multitude of point 
measurement devices as densely as possible so that these locations can be 
measured simultaneously. However, for confocal methods, the measurement 
relies on the blurring of light when the object is out of focus and is therefore 
intrinsically vulnerable to crosstalk when two measurement points are placed 
too close to each other. Sheppard and Mao [She88] have demonstrated the 
possibility of a slit scanning confocal system through their theoretical analysis. 
Although such a system is infinitely dense in one lateral direction, only one 
location can be measured in the other direction. For a 2D grid of confocal 
measurement points, a minimum pitch between adjacent points has to be 
maintained to avoid any crosstalk which might degrade the measurement accuracy, 
thus limiting the maximum lateral density of the measurement device. 
Even worse, a longer axial measurement range leads to a higher possibility 
of crosstalk. Additionally, when high numerical aperture (NA) optics are applied 
for increased resolution, the system also becomes more vulnerable to 
lateral crosstalk. This problem is also more severe when a 2D imaging sensor 
is applied, since the image captured and transferred contains largely unoccupied 
areas with little information regarding the actual measurement locations, 
which is highly inefficient. 


As the density of a 2D confocal measurement grid increases, the gradually
increased crosstalk reduces the confocal microscope to a conventional
wide-field microscope, which loses its depth discerning capability. Luckily,
with a sufficient amount of surface texture, methods such as SFF can be applied
to retrieve the 3D profile of the surface, albeit with reduced resolution and
accuracy compared to the equivalent confocal setup. Nevertheless, SFF methods
still suffer from an efficiency problem in the axial scanning process,
which also affects the confocal methods. In both cases, an axial signal
with a Gaussian-like peak has to be retrieved, whose position indicates the
relative distance between the measurement system and the target object.
Therefore, for both approaches, a stack of measurements has to be acquired
while the optical probe (point or plane of focus) is scanned axially. To
accurately locate the peak position of the axial signal, a uniform
(equidistant) sampling approach is typically adopted. Although widely applied,
such a sampling approach is highly inefficient, as most of the sampled values
are close to zero and contain very little information regarding the position
of the signal peak. According to estimation theory [van07], the measurement
uncertainty is directly related to the gradient of the estimator, i.e., the
slope of the signal in this case. Therefore, an objective with higher NA is
preferred to generate a peak as narrow as possible. For a fixed measurement
range, however, a narrower peak requires denser sampling to be located
accurately, further lowering the efficiency of the measurement process.
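

The inefficiency of uniform axial sampling can be illustrated with a small
numerical sketch. The Gaussian response model, the scan range, the step size
and the 1 % threshold below are assumptions chosen for illustration and are not
taken from the thesis. The peak is located by three-point Gaussian
interpolation, which is one common choice among several.

```python
import numpy as np

def axial_response(z, z0, fwhm):
    """Gaussian-like axial response centred at z0 (illustrative model)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-0.5 * ((z - z0) / sigma) ** 2)

def locate_peak(z, signal):
    """Three-point Gaussian interpolation around the largest sample."""
    i = int(np.argmax(signal))
    i = min(max(i, 1), len(z) - 2)           # keep both neighbours in range
    lm, l0, lp = np.log(signal[i - 1:i + 2])
    step = z[1] - z[0]
    return z[i] + 0.5 * step * (lm - lp) / (lm - 2.0 * l0 + lp)

# Hypothetical scan: 100 um range, 0.5 um steps, 2 um wide peak
z = np.linspace(0.0, 100.0, 201)
s = axial_response(z, z0=37.3, fwhm=2.0)
z_hat = locate_peak(z, s)

# Fraction of samples that exceed 1 % of the peak value
informative = np.mean(s > 0.01)
```

In this setting only a few of the 201 samples lie on the peak, while the
remaining samples are practically zero and do not constrain its position.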


To alleviate the aforementioned issues, several measurement methods have 
been developed to enhance the efficiency of optical information acquisition 
in surface profilometry. 


1.3 Main Contributions 


This thesis aims to develop an optical system for high-speed surface profilom- 
etry with a holistic approach. The main contributions are listed below. 


Firstly, an adaptive microscope with axial chromatic encoding has been de- 
signed and constructed, namely the AdaScope. 


• A programmable light source has been constructed based on the echellogram
of a super-continuum laser and a DMD [Luo17c]. The system is capable
of generating spectral peaks with a minimum full width at half maximum
(FWHM) of less than 1 nm. When acting as a scanning bandpass filter, the
wavelength tuning resolution can reach as small as 0.01 nm.


• A programmable array microscope has been proposed, which is coupled
with the programmable light source to allow for the generation of an arbitrary
3D illumination field [Luo13].


Based on the AdaScope platform, various 3D measurement principles have 
been proposed. 


• A compressive shape from focus method has been proposed where the focal
stack is compressively captured and the focus measure is reconstructed
computationally [Luo16c].


• A confocal array scanning principle has been proposed where the axial
measurement range and the lateral array density are adapted iteratively [Luo18].


• A confocal area scanning principle has been proposed based on a tilted
illumination field [Luo20]. Theoretical analysis has been performed together
with numerical simulation.


• For localized confocal scan, a method based on Bayesian experimental
design has been proposed to improve the scanning efficiency. To accelerate
the computation, a recurrent neural network has been developed and
trained to approximate the process [Luo17a].


To validate the proposed system and methods, experimental evaluations have 
been conducted. 


• Performance of the programmable light source has been experimentally
validated [Luo17c] and applied in optical unmixing [Luo16b].


• The Compressive Shape from Focus method has been implemented both in
simulation and in practice, which is capable of generating raw 3D
measurement with a minimum number of frames [Luo16c, Luo17b].


• The iterative array adaptation method has been implemented in practice
and evaluated through a series of experiments [Luo18].


• The direct area scanning method has been implemented in practice and
evaluated through a series of experiments [Luo20].


• The RNN-accelerated Bayesian experimental design method has been
evaluated through simulation [Luo17a].


1.4 Thesis Outline 


This thesis is structured as follows: 


Chapter 2: Related Work. This chapter offers an extensive survey of the
existing literature within the scope of this thesis, including tunable light
sources, the shape from focus technology and various confocal scanning methods.


Chapter 3: Design and Construction of AdaScope. This chapter demonstrates
the design of an adaptive microscope with axial chromatic encoding,
namely the AdaScope. The AdaScope system mainly consists of two components,
the programmable light source and the programmable array chromatic
microscope. Construction of each component is described in detail in its
respective section.


Chapter 4: Cascade Measurement Strategy. Based on the AdaScope platform
presented previously, Chapter 4 discusses a cascade measurement strategy,
which is formed by a series of measurement methods. The compressive
shape from focus method is capable of rough measurement with a minimum
number of frames, which is suitable as a pre-measurement step to limit the
axial measurement range. Based on the result from the pre-measurement step,
the main measurement can be initialized, where two candidate methods are
presented and analyzed, i.e., iterative array adaptation and direct area
scanning with tilted illumination. Last but not least, a post-measurement
refinement step has been proposed based on Bayesian experimental design, which
can be further accelerated through a recurrent neural network.


Chapter 5: Evaluation and Results. In this chapter, a series of experiments
is conducted to evaluate the proposed system as well as the measurement
methods.


Chapter 6: Concluding Remarks. Lastly, the results of this thesis are
summarized and an outlook on future research is presented.


2 Related Work


The goal of this work is to construct an adaptive microscope with chromatic
encoding, based on which a cascade of optical measurement methods can be
developed and applied for surface profilometry. To provide a background, an
extensive overview of the existing literature is presented in this chapter.
These previous works are divided into three sections, which are related to
different aspects of the thesis. Section 2.1 focuses on the development of a
particular type of light source, whose output spectrum is tunable. As a vital
component of the proposed system, the tunable light source is responsible for
the axial scanning of the optical information acquisition process. Section 2.2
gives a brief overview of the shape from focus technology. Based on a
wide-field microscope, this technology achieves 3D measurement of the target
surface using a stack of images, which are acquired while the focal plane is
shifted with respect to the target sample. Although relatively fast in terms
of image acquisition, the robustness of the shape from focus technologies
depends strongly on the image processing techniques as well as the intrinsic
textures of the object surface. Section 2.3 explores the various confocal
scanning methods in both the lateral direction and the axial direction.
Compared to shape from focus technologies, confocal measurement methods are
typically capable of higher resolution and accuracy while demanding a longer
scanning process.


2.1 Tunable Light Source 


The ability to modulate the spectrum of light serves as a powerful tool in
various fields of research, including biomedical optics, optical
communication, hyperspectral imaging, optical measurement, etc. The
variability of the spectrum in such systems enables various analog signal
processing methods, which greatly improve the performance and open up new
possibilities. For example, Hirai et al. developed a multispectral image
projector using a programmable spectral light source [Hir16]. By using
multiple primary colors, the system is capable of wide-gamut projection.


Earlier development of spectrum modulation was mainly driven by the need for
wavelength division multiplexing (WDM) in optical communication as well
as chemical analysis. Various technologies have been proposed and
implemented to realize a tunable spectral filter, including acousto-optics
[Hua96, Sap02], liquid crystals [Pat91], fiber Bragg gratings [nuo],
interferometers [Din11], etc. These methods typically focus on the realization
of tunable bandpass filters, some of which offer tunable bandwidth.
Nevertheless, they lack the capability of manipulating a complete spectrum.


With the wide popularity of commercial Digital Light Processing (DLP) 
projectors, the digital micromirror device has been receiving an increasing 
amount of attention for the development of novel optical systems. Riza et al. 
introduced a digitally controlled multiwavelength programmable attenuator 
using a two-dimensional DMD [Riz99]. Based on this concept, a broadband 
optical equalizer was developed later [Riz03]. Chuang et al. proposed a
programmable light spectrum synthesis system which used collimated
output from a single mode fiber as the primary source [Chu06]. The light
is dispersed onto a DMD chip with a diffraction grating. One dimension 
of the DMD is used for wavelength selection while the other dimension 
is used for intensity modulation of the corresponding wavelength. This 
concept has been commercialized using a Xenon arc lamp as the primary 
source and DMD for spectral filtering (OneLight Spectra by OneLight Corp. 
and OL-490 Agile Light Source by Optronic Laboratories). Although having 
been applied in surgical and biomedical research, these sources typically 
suffer from performance limitations due to the compromise between spectral 
resolution and efficiency. Wood et al. proposed to use a supercontinuum
laser (also known as white light laser) as the primary source coupled with a
prism as the disperser and a DMD as the modulator to construct a tunable
light source [Woo12]. Diffraction analysis is made considering the DMD as a
blazed grating. The system is capable of producing illumination bands with a
roughly constant width of 6 nm.
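

The two-dimensional use of the DMD described above, with one axis selecting
the wavelength and the other axis setting its intensity, can be sketched as
follows. This is a minimal illustration under stated assumptions: the function
names, the number of mirror rows, the wavelength bins and the flat source
spectrum are hypothetical and do not describe any of the cited systems.

```python
import numpy as np

def dmd_mask(target, n_rows):
    """Binary mirror pattern: column j switches on round(target[j] * n_rows) mirrors."""
    target = np.clip(np.asarray(target, dtype=float), 0.0, 1.0)
    on_counts = np.rint(target * n_rows).astype(int)
    rows = np.arange(n_rows)[:, None]
    return rows < on_counts[None, :]

def synthesized_spectrum(mask, source):
    """Output spectrum, assuming every column is illuminated uniformly along its rows."""
    return source * mask.mean(axis=0)

wavelengths = np.linspace(450.0, 650.0, 5)       # nm, hypothetical bin centres
source = np.ones_like(wavelengths)               # flat source spectrum (assumption)
target = np.array([0.0, 0.25, 1.0, 0.5, 0.0])    # desired relative intensities
mask = dmd_mask(target, n_rows=8)
out = synthesized_spectrum(mask, source)
```

The number of rows sets the intensity quantization in this simplified picture,
which hints at the trade-off between spectral resolution and efficiency
mentioned for the commercial sources.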


2.2 Shape from Focus 


Depth estimation based on an imaging system has been a widely studied topic 
in the area of computer vision and image processing. Generally, existing 
methods can be classified into active methods and passive methods. Active 
methods involve the projection of an optical probe onto the target scene, often
in the form of a laser beam or an illumination pattern [Sch11]. The 3D profile
of the target scene is reconstructed with the information in the
scattering/reflection of the optical probe captured by the imaging system. The
requirement of the additional projection/illumination system will increase the
complexity and the cost of the active methods, inevitably limiting their
applicability. In situations where physical interaction with the scene is not
allowed, passive methods are applied by taking images of the scene without
specific additional illumination. Various depth cues in the captured images
have been proposed by researchers, including stereopsis [Mar76], shading
[Zha99], focus [Nay94], etc., which are used to reconstruct the 3D
information. In this section, the usage of focus as a cue for depth
measurement is discussed.


Research in this area mainly focuses on two topics, i.e., the design of
robust focus measure operators and the development of estimation
algorithms. Pertuz et al. made an extensive survey and comparison of
popular focus measure operators for shape from focus. Apart from the
operators listed in the above survey, more complex operators are being
developed constantly not only for shape from focus methods but also for
sharpness estimation as a more general topic, such as the S3 operator by
Vu et al. [Vu12], which utilizes both spatial and spectral information in
color images.


Conventional estimation algorithms involve localizing the maximum focus
position from the focal stack for each pixel. A widely accepted method
is to fit a Gaussian model as proposed by Nayar et al. [Nay94].
Alternatively, other fitting methods have also been studied, such as quadratic
and polynomial fits [Sub95]. With the recent development of machine
learning and optimization algorithms, more sophisticated methods have been
proposed by breaking the isoplanatic restriction [Sub95], such as surface
fitting and optimization by neural networks [Asi01], and total variation
regularization [Mah13]. Through adoption of a deep neural network, Hazirbas
et al. [Haz18] proposed an SFF method, which provides an end-to-end solution
for depth reconstruction from the focal stack. As can be seen from the listed
literature, the design of the focus measure operator and the development
of the estimation algorithm are often conducted simultaneously in a holistic
manner in order to improve the performance of the overall method.
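

The conventional pipeline described above, i.e., a focus measure evaluated for
every frame followed by a per-pixel peak search with a Gaussian model, can be
sketched in a few lines. This is an illustrative sketch and not the
implementation of any cited work: the focus measure is a plain modified
Laplacian without window summation, the refinement uses three-point Gaussian
interpolation, and the checkerboard focal stack is synthetic.

```python
import numpy as np

def modified_laplacian(img):
    """Modified Laplacian |d2/dx2| + |d2/dy2| with edge replication."""
    p = np.pad(img, 1, mode="edge")
    dxx = np.abs(2.0 * p[1:-1, 1:-1] - p[1:-1, :-2] - p[1:-1, 2:])
    dyy = np.abs(2.0 * p[1:-1, 1:-1] - p[:-2, 1:-1] - p[2:, 1:-1])
    return dxx + dyy

def shape_from_focus(stack, z):
    """Per-pixel depth: arg-max of the focus measure, refined by a Gaussian fit."""
    fm = np.stack([modified_laplacian(f) for f in stack]) + 1e-12
    k = np.clip(np.argmax(fm, axis=0), 1, len(z) - 2)
    rows, cols = np.indices(k.shape)
    lm, l0, lp = (np.log(fm[k + d, rows, cols]) for d in (-1, 0, 1))
    denom = lm - 2.0 * l0 + lp
    valid = np.abs(denom) > 1e-12            # flat regions get no refinement
    shift = np.where(valid, 0.5 * (lm - lp) / np.where(valid, denom, 1.0), 0.0)
    return z[k] + shift * (z[1] - z[0])

# Synthetic focal stack: checkerboard whose contrast peaks at z = 4.2
z = np.linspace(0.0, 10.0, 21)
yy, xx = np.indices((8, 8))
texture = np.where((xx + yy) % 2 == 0, 1.0, -1.0)
stack = [0.5 + 0.5 * texture * np.exp(-0.5 * ((zk - 4.2) / 1.5) ** 2) for zk in z]
depth = shape_from_focus(stack, z)
```

The sketch depends on surface texture in the same way as the methods above: on
a textureless region the focus measure is flat and no refinement is possible.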


Unlike shape from defocus techniques, where the blur kernel is assumed 
known (e.g., [Favos]), SFF techniques generally require a minimum number 
of image samples along the focal axis in order to perform robust estimation. 
This is realized by either shifting the focal plane or changing the relative 
distance between the camera and the scene, while images are captured. 
When a large number of images is required, such shift/movement commonly 
leads to a slow measurement speed and bulky systems. Additionally, the 
large number of the images, which is needed for evaluation, adds to the 
data transfer and the computational cost. Despite the development of 
various image processing techniques for depth reconstruction in the previous 
decades, the way of image capture remains relatively unchanged. 


2.3 Confocal 3D Microscopy 


First invented by Minsky in the late 1950s [Min61], confocal microscopy
differs from conventional wide-field microscopy by the fact that both the light
source and the detector are filtered by a pinhole. Such confocal filtering not 
only improves the lateral resolution of the microscopic system, but also en- 
ables the system to be sensitive to the axial position of the sample. As a non- 
contact measurement technology, confocal microscopy has been widely ap- 
plied in various fields including surface profilometry, biomedical imaging as 
well as other applications, due to its unique capability of depth discerning. 
Nevertheless, to achieve complete 3D measurement of the target, confocal
microscopy in its raw form depends on the scanning of a single focused point
both laterally and axially, which leads to a relatively slow measurement speed.
To tackle this problem, a great amount of research effort has been invested in
the past decades to accelerate the measurement speed of confocal microscopy.


2.3.1 Confocal Lateral Scanning 


In the lateral direction, multiple methods have been invented to accelerate 
the scanning speed, which can be categorized into two different approaches: 
faster scanning of a single focal point and scanning of an array of focal points. 


Single Point Scanning 


Acceleration of single point scanning is realized through utilization of 
faster mechanical components. Conventional confocal scanning systems 
aim to generate a relative movement between the focal spot and the target 
sample either through physical movement of the system/sample, or through 
manipulation of the focal spot. With the development of opto-mechanical
components, the speed of conventional scanning methods has also been
improved over the years. Arrasmith et al. have developed a 2D confocal
scanning system based on a MEMS bi-axial micro-mirror [Arr10]. By
changing the angle of the illumination beam through controlled tilting of
the micro-mirror, the focal spot of the optical system is effectively shifted.
Through adoption of a micro-mirror which is electromagnetically actuated,
the system is able to achieve SVGA resolution (800 x 600 pixels) at 56
frames per second. Similarly, Liu et al. have developed a compact fiber-optic
endoscopic probe, where a MEMS scanner is integrated to achieve confocal
scanning [Liu14].


Apart from scanning achieved through a single tilting mirror, a high speed 
laser scanning confocal microscope has been presented by Choi et al. based 
on the combination of a fast rotating polygonal scanning mirror for the fast 
axis and a bi-directional scanning galvanometer-driven mirror for the slow 
axis [Cho13]. The proposed system is capable of an acquisition rate up to 200 
frames per second (512 x 512 pixels).


Based on measurement of a single focal point, methods categorized in the
first approach typically hold the advantage that the light detection system
and the data processing algorithms are relatively simple and straightforward.
Additionally, application of MEMS components allows for miniaturization of
the complete system [Liu14]. Nevertheless, measurement speed as well as
robustness of such systems is largely limited by the respective mechanical
components.


Array Scanning 


Although the scanning of a single focal point can indeed be accelerated 
through application of state-of-the-art opto-mechanical components, the 
majority of the research effort has been focused on the second approach, 
which aims to achieve a multitude of measurement points simultaneously 
through the same optical system. Instead of scanning a single focal point,
an array of focal points is illuminated, detected and shifted. Xiao et al.
pioneered this approach by proposing a confocal scanning microscope based on a
spinning Nipkow disk for real-time direct viewing [Xia88]. The pinhole array
on the disk performs confocal filtering for both illumination and imaging,
which greatly simplifies the setup compared to earlier tandem scanning
microscopes. Later, Tiziani and Uhde used a similar setup with a Nipkow disk
for 3D measurements [Tiz94a]. In some applications, dense lateral sampling
is not necessary. Under these circumstances, a confocal sensor
with a static array of measurement points is sufficient, which is denoted
as a confocal matrix sensor, such as the one presented by Hillenbrand
et al. [Hil15]. Apart from using a physical pinhole array, other methods
have been developed to generate an array of measurement points, such as
using a microlens array [Tiz94b], a fiber bundle and a diffractive optical
element [Hul12]. With the development of spatial light modulators (SLMs),
the idea of a mechanically shifted point array evolved into an important
field of research, i.e., the programmable array microscope (PAM), where various
kinds of SLMs have been utilized to generate a focal point array for confocal
measurement, including digital micromirror devices [Cha15], liquid
crystals on silicon (LCoS) [Kin14], and polymer-dispersed liquid
crystals (PDLCs) [Cha17]. Such systems allow for dynamic control of the point
array, which enhances the speed and the flexibility of the measurement. By
simultaneously measuring multiple points laterally, the measurement time
of the confocal microscope is greatly reduced. However, as the numerical
aperture and the target depth range increase, measurement points become
more vulnerable to the crosstalk from their adjacent points, thus demanding
a larger minimum pitch distance.
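

The dependence of the pitch on numerical aperture and depth range can be made
tangible with a rough geometric estimate. The model below is a simplifying
assumption and not an analysis from the cited literature: the defocused light
of a point is treated as a cone with half angle arcsin(NA/n), focus is assumed
at mid-range, and crosstalk is assumed to start once the blur disc reaches the
neighbouring measurement point. Diffraction and the pinhole size are ignored.

```python
import math

def defocus_blur_radius(defocus, na, n=1.0):
    """Geometric radius of the defocused light cone at distance `defocus` from focus."""
    half_angle = math.asin(na / n)
    return abs(defocus) * math.tan(half_angle)

def min_pitch(depth_range, na, n=1.0):
    """Pitch at which the worst-case blur disc just reaches the neighbouring point.

    Assumes focus at mid-range, so the worst defocus is half the depth range.
    """
    return defocus_blur_radius(0.5 * depth_range, na, n)

# Hypothetical comparison in object space, lengths in micrometres
pitch_low_na = min_pitch(depth_range=100.0, na=0.2)
pitch_high_na = min_pitch(depth_range=100.0, na=0.5)
pitch_long_range = min_pitch(depth_range=200.0, na=0.5)
```

Under these assumptions the estimate grows linearly with the depth range and
faster than linearly with the numerical aperture, in line with the qualitative
statement above.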


Line/slit confocal scanning can be seen as a special type of the array
scanning method, where a continuous line of points is illuminated and imaged
through a confocal system. Sheppard and Mao first presented a theoretical
foundation for the slit scanning method by showing that the axial resolution
degradation of slit scanning compared to a single point is well within an
acceptable level, especially considering the great benefit of speed
acceleration [She88]. Later, Sabharwal et al. have developed a miniaturized
slit-scanning confocal setup for endoscopic measurement [Sab99]. In [Poh08],
Poher et al. demonstrate a slit scanning confocal microscope based on a 2D
imaging sensor instead of a line detector, with which the blurred part of the
slit image can also be captured. Through application of an advanced image
processing technique of background subtraction, the system obtained an axial
resolution even better than a point confocal system. In general, slit scanning
confocal systems are more efficient than point scanning systems in terms of 
the scanning speed. Nevertheless, both the lateral resolution along the slit 
and the axial resolution are partly sacrificed. Additionally, while one lateral 
axis is covered by the slit, the orthogonal lateral axis still requires scanning, 
which is often performed mechanically. 


Spectrally encoded confocal methods in the lateral direction originate from
early development of 2D image transmission by WDM [Bar79, Men97].
In [Tea98], Tearney et al. have proposed a special type of slit-scanning
confocal microscope based on spectral encoding where a grating is utilized
to disperse light of different wavelengths onto a line. Points on the line are
detected simultaneously through measurement of the returned spectrum
based on Fourier-transform spectroscopy. Compared to conventional slit
scanning, the spectral dispersion provides additional confocal filtering along
the slit direction. Pitris et al. have conceived a novel optical design based on
two prisms and a transmission grating, enabling better miniaturization of a
spectrally encoded line scanning confocal probe [Pit03]. In [Bou05], Boudoux
et al. present a spectrally encoded confocal setup for 2D measurement, where
one lateral axis is scanned through a rapid wavelength-swept laser and the
other lateral axis is scanned with a galvanometer mounted mirror. Kim
et al. have demonstrated a spectrally encoded slit confocal setup for direct
2D measurement [Kim06]. A physical slit is used to define one lateral axis,
which is dispersed in the orthogonal direction so that the other lateral axis is
encoded spectrally. An optical analyzer based on a 2D CCD camera is used
to measure all lateral points simultaneously. In 2015, a confocal system based
on a similar principle was constructed using a wavelength-swept laser and a
line scan camera [Kim15]. To the author’s knowledge, spectrally encoded slit
confocal microscopy is currently the only method which is capable of direct
dense 2D confocal measurement without the necessity of lateral scanning.
However, such a system typically requires very complex illumination and
detection optics, and axial scanning is still required for complete 3D
measurement.


2.3.2 Confocal Axial Scanning 


In the axial direction, improvements of confocal scanning speed are mainly 
achieved through two approaches, i.e., adaptive lenses and chromatic confocal 
technology. 


Adaptive Lenses 


Apart from faster linear axes used for axial scanning, various recent confocal
systems utilize adaptive lenses to vary the focus of the system
electronically at very high speed, including adaptive lenses based on
micro-electro-mechanical systems (MEMS), electro-optic lenses (EOLs) and
acousto-optic lenses (AOLs).


Liu et al. have proposed a 3D confocal scanning microendoscope based on
MEMS for both lateral and axial scanning [Liu14]. For axial scanning, a
tunable MEMS lens has been constructed, which consists of a MEMS lens-scanner
with a central opening and a 2.4 mm diameter glass objective lens
assembled onto the scanner platform. The central platform in the MEMS
lens-scanner is symmetrically supported by four lateral-shift-free
large-vertical-displacement (LSF-LVD) actuators which are driven by
electrothermal actuation. In [Kou14], Koukourakis et al. have developed an
adaptive lens based on a polydimethylsiloxane membrane with a piezo actuator
for axial confocal scanning. With a single adaptive lens in the system,
performance of the axial scan is often degraded due to defocus and spherical
aberration introduced by the tuning of the adaptive lens. By employing a
second adaptive lens in the detection path, aberrations are successfully
balanced and homogeneous axial resolution can thus be achieved.


Based on a different principle, Shibaguchi et al. have developed a
Lead-Lanthanum Zirconate-Titanate (PLZT) electro-optic lens with variable focal
length [Shi92]. In [Kha04], Khan and Riza have constructed a confocal system
based on a tunable focus liquid crystal lens. Despite the relatively fast
tuning speed, an EOL typically suffers from the drawback of being polarization
sensitive.


Kaplan et al. are the first to demonstrate high-speed focus scanning using an
acousto-optic lens [Kap01]. The lens consists of two adjacent acousto-optic
scanners with counterpropagating acoustic waves that have the same
frequency modulation but a π phase difference. Based on such a lens, a confocal
profilometer is constructed, which achieves an axial scan rate of 400 kHz.
Nevertheless, acousto-optic lenses have a weak transmission efficiency, since
multiple crystals are needed to focus a beam. Recently, tunable acoustic
gradient (TAG) index of refraction lenses have emerged as a new generation
of high-speed tunable optical components, which are able to provide complex
beam profiles. Mermillod-Blondin et al. have demonstrated the use of a TAG
lens as a fast varifocal element [Mer08]. Duocastella et al. have integrated an
acousto-optic lens in a commercial confocal system to achieve axial scanning
at 140 kHz [Duo14]. By using synchronized and high time-resolution
detection, simultaneous multiplane confocal imaging can be achieved. In [Szu18],
Szulzycki et al. have developed an AOL operating at a focus tuning rate of
300 kHz, which is combined with a laser scanning confocal microscope for
fast 3D imaging.


Encoding of Axial Response 


Axial chromatic encoding has been widely applied in confocal systems, which 
replaces mechanical scanning by focusing light of different wavelengths onto 
different axial positions. Instead of the monochromatic light source used in 
conventional confocal systems, chromatic confocal technology requires the 
usage of a polychromatic light source for illumination. Decoding of the axial 
information has to be conducted by a wavelength-sensitive detector such as a 
spectrometer or through wavelength scanning. The first chromatic confocal
system was an optical profilometer presented by Molesini et al. [Mol84]. Lateral
dispersion is used in the detection arm so that the spectrum of the reflected 
light can be measured with a photodiode array. Since then, various methods 
have been invented to improve the performance of chromatic confocal scan- 
ning. The application of better light sources is accompanied with new designs 
of the dispersion system to achieve better illumination. Advanced sampling 
mechanisms are developed to measure the spectral data with faster speed. 


Hutley et al. have developed a wavelength-encoded linear displacement
transducer based on a zone plate as the dispersing element [Hut88]. In [Tiz96],
Tiziani et al. have manufactured an array of microlenses formed by zone plates 
to realize a chromatic confocal matrix sensor. Instead of a broadband light 
source, four semiconductor laser diodes with different wavelengths are used 
together with a CCD chip for intensity measurement. To improve upon the 
idea of using zone plates for dispersion, Dobson et al. have designed more 
complex diffractive optical elements for chromatic confocal imaging [Dob97]. 
Meanwhile, a tunable Ti-sapphire laser has been utilized as the light source, 
whose output wavelength is shifted electronically while the confocal inten- 
sity signal is recorded by the detector. Shi et al. have applied a supercon- 
tinuum light source based on non-linear effect of a photonic crystal fiber in 
a chromatic confocal microscope [Shios]. Axial measurement range benefits 
significantly from the broad bandwidth of the supercontinuum light source.


In [Hil12a], Hillenbrand et al. provide comprehensive design strategies
for hybrid hyperchromatic lenses based on the combination of diffractive
and refractive elements. The longitudinal chromatic aberration of the
system is maximized for optimum axial measurement range in a chromatic
confocal system. In [Hil13b], Hillenbrand et al. present a spectrally
multiplexed three-point sensor using a segmented diffractive optical
element (DOE). As each segment of the DOE generates different axial chromatic
dispersion, a single spectrometer can be used to retrieve the axial positions 
of three lateral points. In [Rayıa], Rayer and Mansfield have compared the 
performance of refractive, diffractive and hybrid aspheric diffractive optics 
for the application of chromatic confocal technology. The presented hybrid 
aspheric diffractive lens is able to combine the low geometric aberration of a 
diffractive lens with the high optical power of an aspheric lens, thus achieving 
better performance. 


In the aforementioned methods, decoding of the axial information is
conducted through measurement of the reflected spectrum. The spectrum is
sampled in either the illumination arm or the detection arm of the system. With
illumination sampling, a tunable laser source is typically used whose
wavelength is scanned while a broadband detector records the reflected
intensity for each wavelength. Such a process is time-consuming as each
wavelength has to be measured consecutively. For sampling made directly in
detection, a spectrometer of a certain form is utilized to measure the
reflected spectrum of a broadband illumination source. Although multiple
wavelengths are
measured simultaneously, readout and transfer of the spectrometer data still 
limit the measurement speed of the system. In both cases, dense sampling 
of the complete spectrum is highly inefficient, as the signal can be consid- 
ered to be very sparse and the underlying parameter to be estimated is only 
two-dimensional, i.e., surface reflectance and axial position. The low intensity 
wavelength positions suffer from a low signal-to-noise ratio (SNR) mainly due 
to the photon noise, providing very little additional information. In practice, 
such noise could even have an adverse effect on a naive Gaussian fitting
process. As an example, for a fiber-based chromatic confocal sensor, Luo et al.
have shown that Gaussian fitting performed only for wavelengths with higher
intensity achieves a slightly higher sensitivity compared to Gaussian fitting
conducted for all wavelengths [Luo12].
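

A Gaussian fit restricted to the high-intensity part of the spectrum can be
sketched as a parabola fitted to the logarithm of the samples above a relative
threshold. This is only one possible realization and is not claimed to be the
procedure of [Luo12]. The 50 % threshold, the spectral grid, the peak width
and the noise level are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_peak(wavelengths, spectrum, rel_threshold=0.5):
    """Peak wavelength from a parabola fitted to log-intensity above a threshold."""
    keep = spectrum >= rel_threshold * spectrum.max()
    x = wavelengths[keep]
    x0 = x.mean()                                  # centring improves conditioning
    a, b, _ = np.polyfit(x - x0, np.log(spectrum[keep]), 2)
    return x0 - b / (2.0 * a)                      # vertex of the fitted parabola

# Hypothetical spectrometer data: Gaussian peak at 547.3 nm plus weak noise
wl = np.linspace(500.0, 600.0, 501)
clean = np.exp(-0.5 * ((wl - 547.3) / 3.0) ** 2)
rng = np.random.default_rng(0)
noisy = clean + 0.01 * rng.standard_normal(wl.size)

peak_clean = gaussian_peak(wl, clean)
peak_noisy = gaussian_peak(wl, noisy)
```

Discarding the low-intensity samples also avoids taking the logarithm of
values that noise has pushed to zero or below.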


Therefore, to achieve more efficient sampling of the axial confocal signal, 
multiple methods have been developed to linearly map the high dimensional 
spectrum data into a lower dimensional space using spectral filters in either 
illumination or detection. In [Jon93], Jones and Russel first defined the
idea of wavelength discrimination systems and chromatic discrimination sys- 
tems, which can be applied to chromatic confocal sensing as well. Without 
measuring the reflected spectrum exactly, Tiziani and Uhde have proposed 
a chromatic confocal microscopic system where three color filters are used 
with a CCD camera to generate color images of the sample [Tiz94a]. This 
detection setup is intrinsically equivalent to a spectrometer with very low 
resolution and the depth information is therefore encoded in the RGB color. 
This idea naturally extends to applications of multispectral and hyperspec- 
tral cameras. Kim et al. have proposed a chromatic confocal system with a 
wavelength detection method using transmittance [Kim13]. Two detection 
channels are implemented with two photomultiplier tubes, where one channel 
records the total reflected intensity and the other channel records the intensity 
of the reflected light after filtering through a color filter. In [Tap13], Taphanel 
et al. provide a more complex design methodology for the development of a
multi-channel chromatic confocal detection system using six interference
filters. The physical structure of the interference filters is directly optimized
based on a merit function related to the uniqueness and sensitivity of the
chromatic axial measurement.


Apart from chromatic encoding, Lee et al. have developed a confocal sys- 
tem based on direct mapping of axial information onto two intensity chan- 
nels [Lee1d], which shares some similarity to the idea of chromatic confocal 
detection with transmittance by Kim et al. [Kim13]. Instead of using spec- 
tral filters, two pinholes with different sizes are placed in front of the two 
detectors, generating two different axial response curves, which can be used 
to encode the axial information. 


2.3.3 Confocal 3D Scanning


To perform a complete 3D confocal scan, the methods presented in Section 2.3.1
(Confocal Lateral Scanning) and Section 2.3.2 (Confocal Axial Scanning) are
combined in a single system.


For example, Cha et al. have used chromatic axial encoding together with
dynamically configurable micromirror scanning to achieve nontranslational 3D
profilometry [Cha00]. In [Lin98], Lin et al. have applied chromatic encoding
to a slit-scan confocal system based on a diffractive lens as the dispersing
element. A 2D CCD imager is coupled with a spectral grating to achieve
single-shot 3D measurement of a line. Chen et al. have used a DMD to generate a
scanning array of focal points while adopting chromatic encoding for the axial
measurement [Che11]. Hillenbrand et al. have presented a chromatic confocal
matrix sensor based on a pinhole array [Hil13a]. Later, an actuated pinhole
array was adopted to achieve complete 3D scanning [Hil15]. Jeong et al. have
combined a direct-view confocal microscope with an electrically tunable lens
to achieve a matrix sensor for 3D scanning [Jeo16]. In [LEE18], Lee et al.
have developed a 3D confocal microscope based on a dual-detection method for
axial measurement and a DMD for lateral array scanning.


3 Design and Construction of AdaScope


In this chapter, an adaptive chromatic confocal microscope, the AdaScope, is
presented as the result of a holistic design approach targeting fast 3D surface
profilometry. The AdaScope system is composed of two major components, i.e.,
a programmable light source and a programmable array microscope. Section 3.1
introduces the design and development of the programmable light source.
Calibration and performance of the programmable spectrum generation are also
described in detail. In Section 3.2, a programmable array microscope with
chromatic encoding is presented. Finally, the development of the AdaScope is
summarized in Section 3.3.


3.1 Programmable Light Source 


Illumination systems with a tunable spectrum have been receiving an increasing
amount of attention due to their wide range of applications and unique capabilities. In
the AdaScope system, the programmable light source with fast response and 
accurate spectrum reproduction serves as the foundation for various axial 
scanning methods. 


With the constant development of DMDs, the binary pattern rate has
increased over the years, allowing time multiplexing schemes to be developed
for real-time applications. Therefore, the presented system utilizes the
complete two-dimensional area of a DMD for wavelength selection in order to
achieve superior spectral resolution, while intensity modulation is realized in
a time multiplexing fashion. The system is designed and constructed based on
the orthogonally placed combination of a prism and an echelle grating to
generate the echellogram of a supercontinuum laser on the DMD. A complete
calibration procedure is developed and implemented. Several spectra are
generated and analyzed, indicating a minimum FWHM of less than 1 nm. When
acting as a scanning bandpass filter, the wavelength tuning resolution can
be as fine as 0.01 nm. The proposed filtering system can be constructed
at relatively low cost and easily attached to commonly available
supercontinuum lasers to generate illumination with a programmable spectrum.


3.1.1 Design and Simulation 


The design of the programmable light source is based on the theoretical analy- 
sis of the 2D dispersion from the prism and the echelle grating, as well as an 
optical simulation using the OpticStudio software by Zemax. 


3.1.1.1 Theoretical Background 


A diffraction grating can be characterized with the grating equation [Sos11]:


$$ d\,(\sin\theta_{\mathrm{OUT}} + \sin\theta_{\mathrm{IN}}) = m\lambda \qquad (3.1) $$


where $m$ stands for the diffraction order, $\lambda$ represents the wavelength
of the light, $d$ represents the grating period, and $\theta_{\mathrm{IN}}$ and
$\theta_{\mathrm{OUT}}$ represent the incidence angle and the diffraction angle,
respectively. For a blazed grating, the blazing angle $\theta_{\mathrm{B}}$
defines the angle between the facet normal and the surface normal of the
grating, as shown in Figure 3.1. The blazing angle of the grating is used to
redirect energy to a certain order of diffraction for better efficiency.


An echelle grating is a special type of blazed grating characterized by a large
blazing angle of the grooves and used at very high diffraction orders to obtain
strong dispersion. Since an echelle grating is typically installed in a Littrow
configuration under blazing condition (Figure 3.2), the angular dispersion can
be written as:


Figure 3.1: Schematic of a blazed reflection grating. 


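$$ \frac{\mathrm{d}\theta_{\mathrm{OUT}}}{\mathrm{d}\lambda} = \frac{m}{d\cos\theta_{\mathrm{OUT}}} = \frac{2\tan\theta_{\mathrm{B}}}{\lambda} \qquad (3.2) $$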


Although the groove density of the echelle grating is smaller than that of
common blazed gratings, the groove structure is optimized for a much larger
blazing angle and therefore the light is concentrated into much higher
diffraction orders. As shown in Equation 3.2, in a Littrow configuration under
blazing condition, the dispersion of the grating depends only on the blazing
angle of the grating and the wavelength, which allows the echelle grating to
have much higher dispersion than a normal blazed grating.
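

As a quick numerical check of Equation 3.2, the following sketch evaluates both
forms of the angular dispersion for an idealized in-plane Littrow geometry. The
groove density of 31.6 grooves/mm and the blazing angle of 63.5° are the values
quoted for the echelle grating in this chapter; the choice of the 99th order and
all function names are illustrative assumptions, not part of the original work.

```python
import math

def littrow_wavelength_nm(period_nm, blaze_deg, order):
    """Wavelength satisfying Eq. 3.1 when theta_IN = theta_OUT = theta_B."""
    return 2.0 * period_nm * math.sin(math.radians(blaze_deg)) / order

def dispersion_general(period_nm, order, theta_out_deg):
    """First form of Eq. 3.2: m / (d cos(theta_OUT)), in rad/nm."""
    return order / (period_nm * math.cos(math.radians(theta_out_deg)))

def dispersion_littrow(blaze_deg, wavelength_nm):
    """Second form of Eq. 3.2: 2 tan(theta_B) / lambda, in rad/nm."""
    return 2.0 * math.tan(math.radians(blaze_deg)) / wavelength_nm

period_nm = 1e6 / 31.6   # grating period d for 31.6 grooves/mm
blaze_deg = 63.5
order = 99               # illustrative choice (assumption)

wavelength = littrow_wavelength_nm(period_nm, blaze_deg, order)
general = dispersion_general(period_nm, order, blaze_deg)
littrow = dispersion_littrow(blaze_deg, wavelength)
print(f"{wavelength:.1f} nm: {general:.6f} rad/nm vs {littrow:.6f} rad/nm")
```

The two forms agree only at the Littrow blaze condition. Wavelengths that a full
system simulation assigns to a given order need not coincide with this idealized
value.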




Figure 3.2: Schematic of an echelle grating under Littrow configuration. 


The free spectral range $\Delta\lambda_{\mathrm{FSR}}$ defines the largest
bandwidth in one order that does not overlap with the same bandwidth in
adjacent orders:


$$ \Delta\lambda_{\mathrm{FSR}} = \frac{\lambda_l}{m+1} = \frac{\lambda_s}{m} \qquad (3.3) $$


where $\lambda_s$ and $\lambda_l$ represent the shortest and longest wavelengths
in the $m$-th order, respectively.


For an echelle grating, due to the large blazing angle, high diffraction orders
are utilized, which leads to a very small free spectral range in each order. This
also means that multiple orders will overlap at the same diffraction angle,
making it necessary to apply a secondary disperser in the orthogonal direction
in order to separate the orders from each other. Such a secondary disperser, also
known as a cross disperser, can be placed before or after the echelle grating.
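

The order overlap can be made concrete with a short sketch. For a fixed
geometry, Equation 3.1 implies that all pairs of order and wavelength with the
same product share one diffraction angle. The example uses 558.6 nm in the 99th
order and the 470 nm to 700 nm target band mentioned in this chapter; the
function names and the `max_order` bound are assumptions made for illustration.

```python
def free_spectral_range_nm(wavelength_nm, order):
    """Eq. 3.3 evaluated as lambda / m (approximate width of one order)."""
    return wavelength_nm / order

def same_angle_wavelengths(wavelength_nm, order, band_nm, max_order=300):
    """List (order, wavelength) pairs inside band_nm that share m * lambda."""
    product = order * wavelength_nm
    low_nm, high_nm = band_nm
    return [
        (other, product / other)
        for other in range(1, max_order + 1)
        if low_nm <= product / other <= high_nm
    ]

fsr = free_spectral_range_nm(558.6, 99)
pairs = same_angle_wavelengths(558.6, 99, (470.0, 700.0))
print(f"FSR near 558.6 nm: {fsr:.2f} nm")
print(f"{len(pairs)} wavelengths in the band leave the grating at the same angle")
```

Under these assumptions, several dozen wavelengths of the target band coincide
in angle after the echelle grating alone, which illustrates why a cross
disperser is required.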


3.1.1.2 Optical Design 


The system is first treated as an echellogram system and designed in the
sequential mode of OpticStudio. As illustrated in Figure 3.3, the echellogram
system is composed of five components. A supercontinuum laser serves as
the light source of the system. The target wavelength ranges from 470 nm
to 700 nm. The laser beam first enters an equilateral dispersion prism made
of F2 glass to be dispersed in the horizontal direction. The incidence angle
with respect to the normal of the prism surface is 39°. The output beam is
immediately incident on an echelle grating in the Littrow configuration under
the blazing angle for vertical dispersion. The echelle grating has a blazing
angle of 63.5° and a groove density of 31.6 grooves/mm. Such a combination
of the prism and the echelle grating is chosen so that the echellogram covers
the entire area of the DMD. Finally, the two-dimensionally dispersed light is
focused onto the DMD through an achromatic doublet with a focal length of
100 mm.


Figure 3.3: System schematic of the echellogram system in the sequential mode of OpticStudio.


To demonstrate the orthogonal dispersion generated by the two dispersers,
Figure 3.4 presents the footprint diagrams at two apertures: the echelle
grating surface (left) and the first surface of the achromatic doublet after the
echelle grating (right). The light beams with wavelengths of 468.1 nm and
472.1 nm at the 119th order and 697.6 nm and 706.6 nm at the 78th order are
illustrated and colored by their respective wavelength. As can be seen on the
left, after the prism and before the echelle grating, light is only dispersed in
the horizontal direction. After the echelle grating, vertical dispersion is
introduced as shown by the diagram on the right.


A program is written in Python with the PyZDDE package to automate
the computation of diffraction orders and the setting of the multi-configuration.
In total, 42 configurations are utilized, each representing an order in the range
from the 78th order to the 119th order. Within each order, 21 wavelengths
are specified, equidistantly covering the free spectral range in that order. The
system is telecentric and it has been optimized in terms of the spot size, which
is contained within the Airy disk, offering diffraction limited optical quality.
As an example, Figure 3.5 illustrates the spot diagram of 558.6 nm at the 99th
diffraction order.
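

The wavelength sampling described above can be sketched as follows. This is not
the original PyZDDE script: the calls that would transfer the values to
OpticStudio are omitted, and the function and variable names are hypothetical.
The sketch assumes that the shortest wavelength of each order is known and uses
Equation 3.3 to obtain the longest one. Only the two orders whose wavelengths
are quoted in the text are filled in, under the assumption that 468.1 nm and
697.6 nm mark the short edges of the 119th and 78th order.

```python
def order_samples(shortest_nm, order, n_samples=21):
    """Equidistant wavelengths covering one free spectral range (Eq. 3.3)."""
    longest_nm = shortest_nm * (order + 1) / order
    step = (longest_nm - shortest_nm) / (n_samples - 1)
    return [shortest_nm + i * step for i in range(n_samples)]

# One configuration per diffraction order, 78th to 119th inclusive.
orders = list(range(78, 120))

# Short-edge wavelengths per order; only these two are quoted in the text,
# and treating them as the order edges is an assumption.
shortest_nm = {119: 468.1, 78: 697.6}

configurations = {m: order_samples(lam, m) for m, lam in shortest_nm.items()}
for m, samples in configurations.items():
    print(f"order {m}: {samples[0]:.1f} nm to {samples[-1]:.1f} nm, "
          f"{len(samples)} wavelengths")
```

The long-edge wavelengths obtained this way are close to the 472.1 nm and
706.6 nm quoted for the footprint diagrams.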




3 Design and Construction of AdaScope 


[Figure 3.4: footprint diagrams with panels labeled "Echelle Grating" and "Projection Lens"; beams labeled 468.1 nm, 472.1 nm and 706.6 nm]

Figure 3.4: Footprint diagrams of the 119th order (blue) and the 78th order (red). Left: before 
the echelle grating surface. The green square shows a zoomed-in illustration. Right: 
after the echelle grating and before the achromatic doublet. 


To characterize the spectral resolution of each DMD pixel, the diffraction
ensquared energy fraction is calculated for a specific position while varying the
wavelength to generate the spectral response of the underlying pixel. The
pixel is placed at the centroid of the focused spot at 558.6 nm. The pupil
sampling resolution is specified as 512 x 512 and a wavelength range from
558.5 nm to 558.7 nm is investigated. The goal is to characterize the bandwidth
of light falling on this particular pixel. As illustrated by Figure 3.6, the
spectral response of one pixel with a width of 7.6 µm has an FWHM of 0.065 nm.
Although this value varies with respect to the wavelength, the order of
magnitude remains the same. When a larger area of 10 x 10 pixels is investigated,
the resulting spectral width is slightly increased to 0.108 nm. As the size of the
Airy disk is relatively large with respect to the DMD pixel, the diffraction
effect dominates the calculation of the ensquared energy fraction. This leads to
the non-linear increase of spectral width when the detector area is increased.
Since these values represent the optimum situation in theory, the spectral
widths will be further widened in practice due to imperfect alignment and
tolerances of the optics.


Figure 3.5: Spot diagram of 558.6 nm at the 99th diffraction order. The focusing spot on the DMD
plane is shown in purple and the Airy disk is shown by the black ellipse in the graph.
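

For comparison, a purely geometric estimate of the bandwidth covered by one
pixel can be derived from Equation 3.2 and the focal length of the projection
lens. The sketch below is a rough approximation under stated assumptions: it
uses the ideal Littrow dispersion, a small-angle mapping from angle to position
through the 100 mm doublet, and it ignores diffraction and the cross disperser.
The function name is hypothetical.

```python
import math

def pixel_bandwidth_nm(width_um, focal_mm, blaze_deg, wavelength_nm):
    """Bandwidth spanned by a detector width, from geometric dispersion only."""
    angular = 2.0 * math.tan(math.radians(blaze_deg)) / wavelength_nm  # rad/nm
    linear_um_per_nm = focal_mm * 1e3 * angular                        # um/nm
    return width_um / linear_um_per_nm

one_pixel = pixel_bandwidth_nm(7.6, 100.0, 63.5, 558.6)    # single DMD pixel
block = pixel_bandwidth_nm(76.0, 100.0, 63.5, 558.6)       # 10 pixels wide
print(f"1 pixel: {one_pixel:.4f} nm, 10 pixels: {block:.4f} nm")
```

Under these assumptions a single pixel spans roughly 0.01 nm, well below the
0.065 nm FWHM obtained from the ensquared energy calculation, whereas a width
of ten pixels spans roughly 0.1 nm. This is in line with the observation that
diffraction dominates at the scale of a single pixel.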


3.1.1.3 Non-sequential Simulation 


Once the optimization is complete, the system is reimplemented in the
non-sequential mode of OpticStudio to simulate the generation of the echellogram
and the effect of the DMD. Each pixel of the DMD has two stable states, namely
“on” and “off”, where the micro mirror is tilted by 12° and -12° respectively.
Although the DMD in practice has 1920 x 1080 pixels, due to internal speed
limitations of OpticStudio, the DMD used in simulation is set to 192 x 108
pixels. As illustrated in Figure 3.7, several additional components are placed
compared to the echellogram in the sequential mode. When the pixels of the
DMD are turned on, a coupling lens is used to focus the reflected light into
a liquid light guide. When the pixels of the DMD are turned off, the light
is deflected into a beam trap. Therefore, by sending a specific pattern to the
DMD, certain wavelength components can be selected and coupled into the
light guide for further applications.


The distribution of the light on the DMD is simulated through ray tracing in the
non-sequential mode. In total, 400 wavelengths are specified for each
diffraction order and are divided into 20 groups. The spectral intensities of all
wavelengths are assumed to be equal. Ray tracing is conducted for each group
with 40000 rays. A detector color object with 1920 x 1080 pixels is placed
right above the DMD to collect the projected rays. Another Python program
is written to automate the switching between groups and orders. Figure 3.8
illustrates the simulation result.


[Figure 3.6: two plots of ensquared energy fraction versus wavelength / nm, from 558.5 nm to 558.7 nm]


Figure 3.6: Spectral resolution characterized by the diffraction ensquared energy. Left: calcu- 
lated with one pixel. Right: calculated with a block size of 10 x 10 pixels. 


Figure 3.7: System schematic of the programmable light source in the non-sequential mode of
OpticStudio. Upper: all pixels are on. Lower: all pixels are off.


Figure 3.8: Simulated echellogram upon the DMD chip based on non-sequential ray tracing.


A detector is placed at the focus of the coupling lens to investigate the
intensity distribution at the entrance of the light guide. As shown in Figure 3.9,
when all pixels of the DMD are turned on, the reflected light forms an
irregular spot through the coupling lens onto the detector. In practice, a light
guide with a diameter of 5 mm is utilized to collect as much light as possible,
which is shown as the red circle in Figure 3.9.


Two things should be noted regarding the design and simulation implemented
in OpticStudio. Firstly, the grating efficiency is not taken into consideration
during the simulation. For each order, the energy is assumed to be distributed
evenly within the free spectral range. In practice, according to the
characteristics of the echelle grating, most of the energy will be concentrated
around the blazing angle, i.e., within the free spectral range of each order.
Nevertheless, a small portion of the energy will also spread to other directions,
partly due to the imperfect blazing facet structure. Secondly, only geometric ray
tracing is conducted when investigating the interaction between light and the
DMD. In reality, considering the small size of the DMD pixel, the diffraction
effect cannot be ignored [Woo12]. With its periodic structure, the DMD acts like
a blazed grating with a switchable blazing angle, which generates multiple 2D
diffraction orders instead of simple reflection. Therefore, the simulation of the
intensity distribution in Figure 3.9 is only a rough approximation of the real
scenario.


Figure 3.9: Irradiance distribution at the entrance of the liquid light guide. The red circle
indicates the aperture of the light guide which has a diameter of 5 mm.


3.1.2 Setup and Alignment 


The primary light source in the system is an obsolete model of supercontinuum
laser from Koheras (now NKT Photonics). It is similar to the SuperK EXTREME
EXW-12 model from NKT Photonics, which delivers 1.2 W of power
in the range from 350 nm to 850 nm. As shown in Figure 3.10, the collimated
output of the supercontinuum laser is first expanded with a 4x reflective beam
expander (BE04R/M from Thorlabs). The expanded beam is then reflected
by a visible mirror so that the infrared component passes through the mirror
and enters a beam trap. The reflected beam passes through the prism and is
dispersed in the horizontal direction. As mentioned in the previous section,
the echelle grating has a groove density of 31.6 grooves/mm and a blazing
angle of 63.5°. The grating is manufactured by Richardson Grating Lab using
Zerodur for the substrate and aluminum for the coating. After vertical dispersion
generated by the echelle grating, the two-dimensionally dispersed light is
focused onto the DMD with an achromatic doublet (AC254-100 from Thorlabs).
The DMD used in the system is the DLP LightCrafter 6500 EVM from Texas
Instruments. The DMD chip has 1920 x 1080 pixels with a pitch of 7.6 µm. Light
of wavelengths corresponding to pixels that are turned on is collected by a
second achromatic doublet with a shorter focal length (AC254-30 from Thorlabs)
and coupled into a liquid light guide with a diameter of 5 mm. A second
beam trap is placed in the opposite position to collect light with wavelengths
corresponding to pixels that are turned off.


[Figure 3.10 labels: Beam Expander, VIS Reflective Mirror, Supercontinuum Laser, Achromat #1, Beam Trap, Achromat #2, DMD & Controller]


Figure 3.10: System setup of the programmable light source within its encapsulation.


Although the system is composed of relatively few components, the alignment
proves to be nontrivial. To begin with, laser safety is a major issue throughout
the alignment process. As a Class 4 laser, the supercontinuum laser utilized
in the system is not eye safe even when operated at 1% of power. At the earlier
stage of calibration as well as under scenarios where higher power is required,
an augmented reality setup based on an Oculus Rift and a webcam is utilized. A
Python program with the OpenCV package feeds images from the webcam
to the Oculus Rift with proper distortion correction, in order to avoid any
contact between the eyes of the operator and the laser. In other situations,
laser goggles with an IRD5 filter from NoIR LaserShields are used and an
additional neutral density (ND) filter is applied to the laser. Once alignment is
finished, the complete system is encapsulated with a cage made of anodized
aluminum rails and black cardboard, so that the calibration can be made without
further laser safety protection. In general, alignment with a supercontinuum
laser is always difficult, since proper laser goggles with sufficient filtering will
render the environment and non-illuminating components too dark.


Secondly, as shown in Figure the positions of the prism and the echelle 
grating are designed so that usage of the DMD area is optimized. In practice, 
taking the grating efficiency into consideration, the incidence angle at the 
entrance of the prism has to be modified together with all following compo- 
nents to find the optimum. What adds to the complexity of alignment is that 
the spectrum varies with the power of the supercontinuum laser. In particu- 
lar, shorter wavelengths gain more power as the total power is increased, due 
to the nonlinear effect of supercontinuum generation. As more subtle adjust- 
ments are made using laser goggles as protection, which requires low power 
operation, the DMD space for shorter wavelengths can only be estimated and 
reserved. Since the complete measurement of the echellogram is only possible
at the calibration stage, the cycle of alignment and calibration has to be
repeated several times before an optimum situation is found.


Thirdly, although the prism and the grating are designed to be placed side by 
side, as shown in Figure BA the edge of the ruling area of the echelle grating 
exhibits lower quality than the center area in practice, which may lead to
deterioration of the SNR in the final system. Therefore, the echelle grating is
moved to the left of the prism to allow for the usage of the center area. Although
the angular positions of the orders remain the same, the spatial positions are
more separated at the projection lens, which adds more spherical aberration to
the edge orders. Such aberration is considered acceptable since the spot size is
well below the diffraction limit, as shown in Figure 3.5.


Last but not least, the DMD exhibits very strong diffraction effects, i.e., the
reflection is composed of multiple two-dimensional orders. The rotation of the
DMD and the position of the coupling lens have to be adjusted iteratively to
maximize the collected power while maintaining an acceptable spot size upon
the DMD.


3.1.3 Spectrum Generation 


The spectrum of the programmable light source is controlled with the pattern 
displayed on the DMD, which reflects the desired part of the wavelengths 
into the liquid light guide for output. The DMD is only capable of displaying 
binary patterns, since each micro mirror has only two stable tilting angles,
which correspond to the on and off states. Therefore, the intensity of each
wavelength has to be manipulated through time multiplexing, i.e., multiple 
DMD patterns are combined to generate a spectral energy distribution within 
a certain period of time. Such a process is analyzed and discussed below. 


To begin with, the temporal spectral flux $\Phi_a$ of the programmable light
source can be expressed as


$$ \Phi_a(\lambda, t) = \iint_{D_a} r_a(x_a, y_a, t)\, E_a(x_a, y_a, \lambda)\, \mathrm{d}x_a\, \mathrm{d}y_a , \qquad (3.4) $$


where $r_a$ represents the temporal reflectivity of the spectral DMD and $E_a$
denotes the spectral flux density. The lateral domain of the spectral DMD is
represented by $D_a$ and the lateral coordinates of the spectral DMD are
denoted by $x_a$ and $y_a$.


For a certain duration of time, in which the reflectivity of the DMD changes,
the average reflectivity of the spectral DMD can be defined as


$$ R_a(x_a, y_a) = \frac{1}{T_a} \int_0^{T_a} r_a(x_a, y_a, t)\, \mathrm{d}t , \qquad (3.5) $$


where $T_a$ denotes a certain duration of time.
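

A discrete counterpart of Equation 3.5 can be sketched with a simple threshold
decomposition: a target average reflectivity per mirror is approximated by a
sequence of binary frames of equal duration. This is one possible time
multiplexing scheme chosen for illustration, not necessarily the one used in
the presented system, and the function names are hypothetical.

```python
def binary_frames(target_reflectivity, n_frames):
    """Frame k turns a mirror on when its target exceeds (k + 0.5) / n_frames."""
    return [
        [1 if r > (k + 0.5) / n_frames else 0 for r in target_reflectivity]
        for k in range(n_frames)
    ]

def average_reflectivity(frames):
    """Mean on-state of each mirror over all frames (discrete Eq. 3.5)."""
    n_frames = len(frames)
    return [sum(states) / n_frames for states in zip(*frames)]

targets = [0.0, 0.25, 0.6, 1.0]            # desired average reflectivity of four mirrors
frames = binary_frames(targets, n_frames=8)
achieved = average_reflectivity(frames)
print(achieved)                            # [0.0, 0.25, 0.625, 1.0]
```

With equally weighted frames the achievable reflectivity is quantized in steps
of one over the number of frames, so the approximation error per mirror is at
most half a step.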


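Based on Equations 3.4 and 3.5, the spectral energy distribution of the
generated light within a duration of $T_a$ can be expressed as


$$
\begin{aligned}
Q(\lambda) &= \int_0^{T_a} \Phi_a(\lambda, t)\, \mathrm{d}t \\
&= \int_0^{T_a} \iint_{D_a} r_a(x_a, y_a, t)\, E_a(x_a, y_a, \lambda)\, \mathrm{d}x_a\, \mathrm{d}y_a\, \mathrm{d}t \\
&= \iint_{D_a} \left[ \int_0^{T_a} r_a(x_a, y_a, t)\, \mathrm{d}t \right] E_a(x_a, y_a, \lambda)\, \mathrm{d}x_a\, \mathrm{d}y_a \\
&= T_a \iint_{D_a} R_a(x_a, y_a)\, E_a(x_a, y_a, \lambda)\, \mathrm{d}x_a\, \mathrm{d}y_a .
\end{aligned}
\qquad (3.6)
$$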

Since the DMD is composed of discrete micro mirrors in practice, Equation 3.6
can be rewritten in a matrix form:


$$ \mathbf{q} = \mathbf{E}_a \mathbf{r}_a \qquad (3.7) $$


where $\mathbf{q} \in \mathbb{R}^{n_\lambda}$ represents the generated spectrum
and $\mathbf{r}_a \in \mathbb{R}^{n_a}$ represents the required average
reflectivity of the DMD pixels, which is reshaped into a 1D vector. The matrix
$\mathbf{E}_a \in \mathbb{R}^{n_\lambda \times n_a}$ contains the spectral
responses of all DMD pixels:
$\mathbf{E}_a = [\mathbf{e}_1\ \mathbf{e}_2\ \cdots\ \mathbf{e}_{n_a}]$, where
$\mathbf{e}_i$ represents the spectral flux of the $i$-th pixel. For a given
spectrum, the required reflectivity of the DMD can be calculated by solving the
following nonnegative least square problem:


$$
\begin{aligned}
\underset{\mathbf{r}_a}{\text{minimize}} \quad & \lVert \mathbf{E}_a \mathbf{r}_a - \mathbf{q} \rVert_2 \\
\text{subject to} \quad & \mathbf{r}_a \geq 0 ,
\end{aligned}
\qquad (3.8)
$$


where $\lVert \cdot \rVert_2$ represents the Euclidean norm.


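Nevertheless, solving the problem with optimization algorithms can be very
time-consuming, especially considering the size of the calibration matrix.
Therefore, the condition of nonnegativity is ignored and the problem is
approximated by the following:


$$
\begin{aligned}
\underset{\mathbf{r}_a}{\text{minimize}} \quad & \lVert \mathbf{r}_a \rVert_2 \\
\text{subject to} \quad & \mathbf{q} = \mathbf{E}_a \mathbf{r}_a ,
\end{aligned}
\qquad (3.9)
$$


which attempts to find the least square solution with minimized norm.
The solution can be easily calculated by applying the pseudo-inverse matrix
of $\mathbf{E}_a$ to both sides of the linear equation:


$$ \mathbf{r}_a = \mathbf{E}_a^{+} \mathbf{q} = \mathbf{E}_a^{\mathsf{T}} \left( \mathbf{E}_a \mathbf{E}_a^{\mathsf{T}} \right)^{-1} \mathbf{q} \qquad (3.10) $$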


Since the FWHM of the fitted Gaussian in the calibration matrix is very nar- 
row, the row rank of the calibration matrix is very close to full. Therefore, 
the solution by applying the pseudo-inverse matrix tends to be nonnegative 
in most of the cases. In rare circumstances where the reflectivity has negative 
values, the negative values are clipped to zero. As the pseudo-inverse matrix 
can be calculated off-line and pattern generation requires only one matrix 
multiplication, this method is much faster than solving the nonnegative least 
square problem rigorously with optimization and still provides acceptable re- 
sults. 
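

The pseudo-inverse approach with subsequent clipping can be illustrated with a
small synthetic example. The calibration matrix below is a toy stand-in built
from Gaussian responses on an assumed wavelength grid; its size, the 1 nm FWHM
and the target spectrum are illustrative assumptions and do not reproduce the
measured calibration data. NumPy is assumed to be available.

```python
import numpy as np

def gaussian_responses(wavelengths_nm, centers_nm, fwhm_nm):
    """Toy calibration matrix: one Gaussian spectral response per column."""
    sigma = fwhm_nm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    diff = wavelengths_nm[:, None] - centers_nm[None, :]
    return np.exp(-0.5 * (diff / sigma) ** 2)

wavelengths_nm = np.linspace(500.0, 520.0, 41)   # n_lambda = 41 (assumed)
centers_nm = np.linspace(500.0, 520.0, 81)       # n_a = 81 macro pixels (assumed)
E_a = gaussian_responses(wavelengths_nm, centers_nm, fwhm_nm=1.0)

# Computed once, off-line.
E_a_pinv = np.linalg.pinv(E_a)

# Target spectrum: a broad Gaussian, chosen only for illustration.
q = np.exp(-0.5 * ((wavelengths_nm - 510.0) / 3.0) ** 2)

# Pattern generation needs a single matrix-vector product, then clipping.
r_unclipped = E_a_pinv @ q
r_a = np.clip(r_unclipped, 0.0, None)

residual_unclipped = np.linalg.norm(E_a @ r_unclipped - q)
residual_clipped = np.linalg.norm(E_a @ r_a - q)
print(f"residual before clipping: {residual_unclipped:.2e}")
print(f"residual after clipping:  {residual_clipped:.2e}")
```

A rigorous solution of the nonnegative problem could be obtained with a
dedicated solver such as `scipy.optimize.nnls`, at the cost of an iterative
computation for every target spectrum.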


3.1.4 Calibration and Results


The calibration process aims to characterize the spectral response of each pixel 
so that any arbitrary spectrum can be generated by calculating the corre- 
sponding DMD pattern. The target is to generate the matrix E, presented 
in the previous section. 


3.1.4.1 Calibration Method 


As shown in Figure 3.11, an additional setup is built outside of the
encapsulated programmable light source. The liquid light guide is routed out of
the encapsulation for calibration. Output light from this end of the liquid light
guide is first collimated through an aspheric condenser lens (ACL5040U from
Thorlabs), and then passes through a pair of microlens arrays (#63-230 from
Edmund Optics) for homogenization. Lastly, an achromatic doublet is used to
project the light to a rectangular area, where a fiber is placed which leads to
a spectrometer. A special mounting adapter is machined in-house to hold all
components at the correct positions. The combination of the condenser, the
microlens arrays and the achromat is selected to match the diameter and the
numerical aperture of the liquid light guide output, so that most of the energy
is uniformly concentrated in a central rectangular block of 16 mm x 12 mm,
with a minor portion of the energy leaking to adjacent blocks. The homogenized
rectangular illumination can be directly used in various applications once the
calibration is finished.


For the calibration, all DMD pixels are first turned on and the position
of the fiber end is aligned so that the overall intensity of the spectrum is
maximized. Then each pixel of the DMD is turned on individually while the
corresponding spectrum is recorded. Due to the limitation of the DMD speed, the
signal-to-noise ratio of the spectrometer and the computer storage, instead of
scanning all pixels individually, square blocks of pixels are grouped together to
form macro pixels, which are scanned and measured in practice. The size of the
block is chosen to be 5 x 5 pixels, in order to balance the calibration speed
with the resolution. The integration time of the spectrometer is set to 5 ms,
and the average intensity from five measurements is recorded for each macro
pixel. The spectrometer used for calibration is the HR2000+ model from Ocean
Optics, which covers the range from 190 nm to 1100 nm with a resolution of
0.66 nm and a step size of 0.44 nm.


Figure 3.11: Calibration setup with homogenized rectangular illumination area projected onto
a fiber leading to the spectrometer.


Once the spectra corresponding to all macro pixels are measured, the wavelength
range is cropped to reduce the computational effort and the intensities
are assembled into a 2D array, where one axis represents the wavelength and
the other axis represents the location of the macro pixel. Firstly, an intensity
mask is generated by applying a 1D median filter along the axis of location,
which is subtracted from the original array in order to remove fixed pattern
noise of the spectrometer and intrinsic drift of the supercontinuum laser
spectrum. The 2D array is then reshaped into a 3D hyperspectral cube, where two
axes represent the position of the macro pixel on the DMD, and one axis
represents the wavelength. Afterwards, 2D Wiener filtering is applied to each
wavelength layer for adaptive noise removal. The filtered hyperspectral cube
is denoted by $\mathbf{E}_{\mathrm{3D}}$ and is reshaped back into a 2D array.
After spectra with a maximum intensity lower than the predefined threshold are
discarded, the rest of the spectra are each fitted to a Gaussian peak. All
Gaussian peaks are combined into a 2D calibration matrix $\mathbf{E}_a$, where
the row index of $\mathbf{E}_a$ represents the wavelength and the column index
of $\mathbf{E}_a$ denotes the position of the macro pixel.
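

The final fitting step can be sketched for a single macro pixel as follows. The
text does not specify the fitting routine, so a log-parabola fit (a parabola
fitted to the logarithm of the counts near the peak) is swapped in here as a
simple closed-form stand-in. The 0.44 nm grid mimics the step size of the
spectrometer, while the synthetic peak, the relative threshold and the function
name are illustrative assumptions. NumPy is assumed to be available.

```python
import numpy as np

def fit_gaussian_peak(wavelengths_nm, counts, rel_threshold=0.05):
    """Fit A * exp(-(x - mu)^2 / (2 sigma^2)) via a parabola through log(counts)."""
    mask = counts > rel_threshold * counts.max()     # keep samples near the peak
    x0 = wavelengths_nm[mask].mean()                 # centering improves conditioning
    c2, c1, c0 = np.polyfit(wavelengths_nm[mask] - x0, np.log(counts[mask]), 2)
    sigma = np.sqrt(-1.0 / (2.0 * c2))
    center = x0 - c1 / (2.0 * c2)
    amplitude = np.exp(c0 - c1 ** 2 / (4.0 * c2))
    fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma
    return amplitude, center, fwhm

# Synthetic, noise-free spectrum of one macro pixel (illustrative values).
wavelengths_nm = np.arange(550.0, 570.0, 0.44)
true_sigma = 1.02 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
counts = 1000.0 * np.exp(-0.5 * ((wavelengths_nm - 558.6) / true_sigma) ** 2)

amplitude, center, fwhm = fit_gaussian_peak(wavelengths_nm, counts)
print(f"amplitude {amplitude:.1f}, center {center:.2f} nm, FWHM {fwhm:.2f} nm")
```

Each fitted peak, evaluated on the wavelength grid, would then form one column
of the calibration matrix. With noisy data, a nonlinear least squares fit is
usually more robust than the log-parabola approach shown here.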


3.1.4.2 Calibration Result 


During the calibration procedure, a 3D hyperspectral tensor
$\mathbf{E}_{\mathrm{3D}} \in \mathbb{R}^{n_\lambda \times n_x \times n_y}$
is generated containing measured spectra for all macro pixels of the DMD,
where $n_x$ and $n_y$ represent the columns and rows of the macro pixels. By
combining the wavelength layers in the tensor $\mathbf{E}_{\mathrm{3D}}$, the
echellogram upon the DMD can be synthesized. As shown in Figure 3.12, the actual
echellogram is rotated with respect to the simulated result shown in Figure 3.8
due to the rotation of the DMD coordinate system in the alignment process.
Although most of the energy is concentrated within the free spectral range of
each order, a small part of the energy leaks out of the free spectral range.
Therefore, for certain wavelengths, the energy is distributed over two or three
different orders and locations. The arc stripe where the intensity is slightly
reduced in the lower part of the synthesized echellogram might be caused by a
groove structure anomaly of the echelle grating. It should be noted that the
resolution of the echellogram synthesis is limited by the spectral resolution of
the spectrometer. To be more specific, a spectrometer with better spectral
resolution is capable of generating a sharper synthesized echellogram. In general,
the prism is able to separate the orders as well as expected and the area
of the DMD is fully utilized.


Although the tensor $E_{3D}$ can be reshaped to directly generate the target
matrix $E_\lambda$ by flattening the two lateral dimensions, in the practical
calibration procedure the measured spectral peak of each effective macro pixel
is fitted to a Gaussian function to reduce the noise in the signal. Macro
pixels with a maximum intensity smaller than a threshold are discarded due to
low SNR. In Figure 3.13, the relationship between the FWHM and the peak
wavelength of the fitted Gaussian is illustrated by plotting the fitting result
of each macro pixel as a dot. The average FWHM of the fitted Gaussians is
1.02 nm, which indicates that the calibration resolution is very likely limited
by the spectral resolution of the spectrometer (approx. 0.66 nm) as well as its
pitch size (0.44 nm). Therefore, the actual peak width of the spectrum
corresponding to one macro pixel is potentially smaller than the currently
measured value.


Figure 3.12: Synthesized echellogram upon the DMD chip generated from spectral
measurements of the scanning macro pixels.


Figure 3.13: FWHM with respect to center wavelength of the fitted Gaussian
peaks. Measurements corresponding to 54338 macro pixels are drawn.
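

The per-macro-pixel processing described above (threshold test, then peak
characterization) can be sketched as follows. This is only a minimal
illustration: intensity-weighted moments are used here as a simple stand-in for
the least-squares Gaussian fit, the spectrum is synthetic, and the function
names and the threshold value are hypothetical.

```python
import math

def gaussian(wl, amplitude, center, fwhm):
    """Gaussian peak parameterised by its FWHM (all wavelengths in nm)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amplitude * math.exp(-0.5 * ((wl - center) / sigma) ** 2)

def characterize_peak(wavelengths, counts, threshold):
    """Return (peak_count, center, fwhm), or None if the peak is below threshold."""
    peak_count = max(counts)
    if peak_count < threshold:
        return None  # macro pixel discarded because of low SNR
    total = sum(counts)
    center = sum(w * c for w, c in zip(wavelengths, counts)) / total
    variance = sum(c * (w - center) ** 2 for w, c in zip(wavelengths, counts)) / total
    fwhm = 2.0 * math.sqrt(2.0 * math.log(2.0) * variance)
    return peak_count, center, fwhm

# Synthetic, noise-free spectrum sampled with the 0.44 nm pitch quoted in the text.
wavelengths = [540.0 + 0.44 * i for i in range(50)]
counts = [gaussian(w, 1000.0, 550.0, 1.02) for w in wavelengths]

result = characterize_peak(wavelengths, counts, threshold=100.0)
weak = characterize_peak(wavelengths, [0.01 * c for c in counts], threshold=100.0)
```

On real, noisy spectra a proper least-squares fit is more robust than moments,
because background counts far from the peak bias the estimated variance.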
The amplitudes of the fitted Gaussian peaks are also plotted in Figure 3.14.
A periodic variation of the maximum amplitude can be observed in the figure.
This is because not all energy is concentrated within the free spectral range
of each order. For a wavelength at the center of the free spectral range of one
diffraction order, most energy is concentrated in the corresponding macro
pixel, which yields a higher maximum amplitude for the fitted Gaussian
function. For wavelengths at the edge of the free spectral range, part of the
energy is distributed to the adjacent order, which lowers the maximum amplitude
for this wavelength.


Figure 3.14: Amplitude with respect to the center wavelength of the fitted
Gaussian peaks. Measurements corresponding to 54338 macro pixels are drawn. The
amplitude is shown in counts, which is the native unit of the spectrometer.


To evaluate the effect of the calibration, especially the fitting process, all
fitted Gaussian signals of the effective macro pixels are summed and compared
against the total spectrum measured when all pixels are turned on. The total
spectrum is measured using a much shorter integration time in order to avoid
saturation of the spectrometer, and is therefore scaled according to the ratio
of integration times. As can be seen in Figure 3.15, the sum of the selected
macro pixels is quite close to the spectrum when all pixels are turned on. In
fact, only 22% of the macro pixels are selected for calibration. These
contribute the majority of the energy, whereas the rest of the pixels are
discarded for their poor signal-to-noise ratio.


Figure 3.15: Comparison between measured total spectrum and calculated spectrum
by summing all useful macro pixels.


Once the average reflectivity is calculated based on Equation it is re- 
shaped into the complete DMD size and quantized into an 8-Bit pattern, which 
can be displayed through time multiplexing. DMD patterns for several target 
spectra are calculated and the generated spectra are measured with an
integration time of 5 ms.
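

The quantization and time multiplexing mentioned above can be illustrated with
a small sketch. It assumes that an 8-bit pattern is shown as eight binary bit
planes whose display durations are proportional to their binary weights. The
actual timing scheme of the DMD controller is not specified here, so this is an
assumption, and all function names are hypothetical.

```python
def quantize_8bit(reflectivity):
    """Map reflectivities in [0, 1] to integer gray levels 0..255."""
    return [min(255, max(0, int(r * 255 + 0.5))) for r in reflectivity]

def bit_planes(levels):
    """Split 8-bit levels into 8 binary patterns, least significant bit first."""
    return [[(v >> b) & 1 for v in levels] for b in range(8)]

def time_average(planes):
    """Mean mirror on-time if plane b is displayed for a duration proportional to 2**b."""
    total = sum(2 ** b for b in range(len(planes)))  # 255 time units for 8 planes
    count = len(planes[0])
    return [sum(planes[b][i] * 2 ** b for b in range(len(planes))) / total
            for i in range(count)]

reflectivity = [0.0, 0.2, 0.4, 0.8, 1.0]  # average reflectivity of five pixels
levels = quantize_8bit(reflectivity)
planes = bit_planes(levels)
effective = time_average(planes)
```

The time-averaged reflectivity reproduces the quantized level exactly, so the
only error introduced in this idealized picture is the 8-bit rounding itself.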


Figure 3.16 illustrates the generation of a flat spectrum spanning the range
from 480 nm to 700 nm. Despite some variation of the intensity due to the
intrinsic spectral variation of the supercontinuum laser and noise, the
generated spectrum is very close to the target. The overshoot at the edges of
the flat spectrum can be explained by the Gibbs phenomenon, as the maximum
frequency that the system is capable of generating depends on the FWHM of the
fitted Gaussian for each macro pixel.


Figure 3.16: Comparison between measured flat spectrum and target spectrum.


Figure 3.17: Comparison between measured ramp spectra and target spectra.


Figure 3.17 demonstrates two ramp spectra from 480 nm to 680 nm. Similar to the
flat spectrum, overshoot can be observed at the abrupt changes at the edges,
while the rest of the spectra closely reproduce the targets.


Figure 3.18: Comparison between measured Gaussian spectrum and target Gaussian
spectrum.


Last but not least, Figure 3.18 shows a Gaussian spectrum centered at 580 nm
with a FWHM of 10 nm. Gaussian peaks with narrower FWHM can be generated as
well. The smallest possible FWHM is limited by the FWHM of the spectrum
generated by a single macro pixel. As mentioned previously, the average FWHM of
the fitted Gaussian for each macro pixel is 1.02 nm, which is very likely
limited by the resolution of the spectrometer in the current calibration setup.
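

To give a feeling for how much narrower the true single-macro-pixel peak might
be, the following sketch removes the instrument broadening in quadrature. This
rests on an assumption that is not verified in this work: both the true peak
and the spectrometer response are taken to be Gaussian, and the effect of the
0.44 nm pitch is ignored. The result is therefore only an illustrative estimate
of about 0.78 nm, not a measured value.

```python
import math

def deconvolved_fwhm(measured_fwhm, instrument_fwhm):
    """FWHM of the underlying peak if peak and instrument response are both Gaussian."""
    if measured_fwhm <= instrument_fwhm:
        raise ValueError("measured width must exceed the instrument width")
    return math.sqrt(measured_fwhm ** 2 - instrument_fwhm ** 2)

# 1.02 nm (average fitted FWHM) and 0.66 nm (spectrometer resolution) are quoted in the text.
estimate_nm = deconvolved_fwhm(1.02, 0.66)
print(f"estimated intrinsic FWHM: {estimate_nm:.2f} nm")
```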


3.1.4.3 Wavelength Tuning Resolution 


As the spectrum of the system can be programmed arbitrarily, the combination of
the echelle grating and the DMD can also be treated as a tunable band-pass
filter, achieving a function similar to an acousto-optic tunable filter (AOTF),
which is often applied to a supercontinuum laser to realize wavelength
scanning.


To characterize the wavelength tuning resolution of the current system, a
series of Gaussian spectra are generated, each having a FWHM of 2 nm. The
center wavelength of these spectra ranges from 560 nm to 580 nm with a step
size of 0.01 nm. The calibration setup mentioned previously is used to record
the generated spectra with an integration time of 5 ms. Gaussian fitting is
then applied to all recorded spectra to yield measured center wavelengths,
which are compared against the target. As shown in Figure 3.19, the measured
scanning results demonstrate high linearity with very few errors. When zoomed
into a smaller region of 0.2 nm, it can be seen that a step size of 0.01 nm is
correctly reflected by the measurement, with minor errors caused mainly by
noise. As an example, Figure 3.20 illustrates five measured spectra from the
scanning sequence.


Figure 3.19: Scanning of Gaussian spectra with FWHM of 2 nm. Left: wavelength
range from 540 nm to 560 nm. Right: zoomed wavelength range from 549.9 nm to
550.1 nm.


3.1.5 Discussion 


In this section, a novel programmable light source in the visible range is
presented, which utilizes a supercontinuum laser as the primary source and
combines its echellogram with a digital micromirror device (DMD) for
programmable spectral filtering. The echellogram is first designed in the
sequential mode of OpticStudio to have diffraction-limited telecentric imaging
upon the DMD. Then the complete system, including the DMD and the coupling of
the output light into a liquid light guide, is simulated in the non-sequential
mode with ray tracing to generate an echellogram image showing the free
spectral range of diffraction orders No. 78 to No. 119.


Figure 3.20: Five Gaussian spectra from the scanning sequence. The fitted
center wavelength is represented by $\lambda_c$.


The system is constructed in the lab and evaluated. To calibrate the
programmable filter, the output of the liquid light guide is homogenized and
projected onto a fiber spectrometer. Pixel blocks of 5 × 5 are scanned while
their corresponding spectral responses are recorded. The recorded data are
cleaned, smoothed and fitted in several processing steps before the calibration
matrix is constructed by assembling the fitted Gaussian spectra of the useful
macro pixels. The average FWHM of the fitted Gaussians is 1.02 nm, which is
believed to be limited by the resolution and the step size of the spectrometer.
During the calibration procedure, intensity measurements of different
wavelengths can be combined to synthesize the echellogram upon the DMD. Two
major differences are observed between the measured echellogram and the
simulated result from ray tracing. Firstly, a small part of the energy leaks
out of the free spectral range of each order due to the imperfect blazing
structure of the echelle grating, which depends on the manufacturing accuracy.
Secondly, each order is wider than in the simulated results, since the ray
tracing simulation does not take into account the diffraction limit, as is
illustrated in


Figure 


Several exemplary spectra are generated and compared against the target, 
which demonstrates that the spectral filtering is relatively accurate. A series 
of Gaussian spectra are generated and measured to investigate the wavelength 
tuning resolution of the system when operated as a scanning source. Results 
have shown that the system is responsive to a step size of 0.01 nm. 


Currently, three major factors limit the performance of the system. Firstly,
the supercontinuum laser, which is used as the primary input to the
programmable spectral filtering system, has limited spectral stability. On the
one hand, the intrinsic spectral variation of the supercontinuum laser is
passed directly to the final output. On the other hand, as the calibration
process takes a relatively long period of time, the spectral variation is also
transferred into the calibration matrix, reducing its accuracy. Secondly, like
most echellogram systems, the programmable filtering setup is very sensitive to
mechanical vibrations, since a tiny movement of the echelle grating shifts
wavelengths by multiple pixels. Currently, Sorbothane feet are attached to the
breadboard to absorb vibration. Nevertheless, an optical table with active
self-leveling isolators would further enhance the stability of the system. Last
but not least, the resolution and step size of the USB fiber spectrometer limit
the calibration accuracy. A spectrometer with a smaller measurement range but
higher resolution would be more suitable for the calibration task.


In summary, the proposed system is potentially useful for any optical
application where manipulation of the wavelength is necessary. In particular,
the system provides a versatile prototyping platform for measurement systems
based on the chromatic principle as well as hyperspectral imaging technologies.
The output light from this programmable light source is directly fed into the
programmable array microscope to form the AdaScope, based on which various
measurement methods are developed.


3.2 Programmable Array Microscope


In this section, a programmable array microscope with axial chromatic encod- 
ing is presented. Using the aforementioned programmable light source as the 
illumination source, the presented system is capable of adaptively changing 
its measurement mode through electronic control of the light source as well 
as the programmable array. 


3.2.1 System Design 


The programmable array microscope setup is similar to the conventional 
reflective confocal scanning microscope, except that a DMD is used as a 
spatially-programmable light source and a camera is used to measure all 
lateral locations simultaneously. As a reflective setup, the system can be 
considered as being composed of two parts, i.e., the illumination arm and 
the imaging arm, which share the same chromatic optics for axial chromatic 
encoding. 


Figure 3.21: System schematic of the programmable array microscope with
chromatic encoding.


As illustrated in Figure 3.21, light transported through the liquid light guide
from the programmable source is first collimated and homogenized by a group of
homogenization optics before it is projected into a rectangular illumination
field upon the DMD using an achromatic doublet. The DMD used in the microscope
setup is the same model as the one in the tunable light source. Each pixel of
the DMD acts as a secondary point source which can be programmably addressed.
After passing through a series of collimation lenses and being reflected by the
pellicle beamsplitter (BP245B1 from Thorlabs), light from selected DMD pixels
is projected onto the target sample using an objective lens with designed
chromatic separation along the optical axis (Precitec CLS4). The reflected
light passes through the chromatic objective once again and is focused onto an
sCMOS camera (Andor Zyla 5.5).


Figure 3.22: Schematic and CAD rendering of the beam homogenization optics
(component labels in the figure: ACL5040U, AC508-150, LLG0538).


The major difference between the programmable array microscope and a
conventional single point scanning microscope is the additional requirement of
homogenized illumination on the programmable pinhole array. In the AdaScope
system, a beam homogenization module has been developed based on the
combination of a condenser lens, two microlens arrays and an achromatic
doublet, as shown in Figure 3.22. Two microlens arrays are utilized since both
the diffraction effect and the flat-top broadening are minimized compared to a
single microlens array [Bue02]. The specifications of the condenser lens, the
microlens arrays and the achromatic doublet are chosen in combination according
to the output diameter and NA of the liquid light guide, so that the resulting
rectangular illumination field is only slightly larger than the effective area
of the DMD, thus fully utilizing the power of the programmable light source. To
find the optimum configuration of the various components, the exit facet of the
liquid light guide is treated as an extended light source. Each point of the
facet emits a bundle of light rays (shown in blue or red in Figure 3.22), which
is collimated by the condenser lens. Depending on the position of the point
source on the facet, the collimated light forms a certain angle with respect to
the optical axis. The collimated light then passes through a microlens array,
generating a two-dimensional array of focal points. A second microlens array
with the same specification is placed at the focal plane of the first array to
rectify the beam bundle of each focal point. Lastly, an achromatic lens
projects the light rays onto a rectangular field. A customized adapter has been
designed and manufactured for the mounting of all components.
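

The way the component specifications interact can be sketched with the standard
paraxial relations for an imaging homogenizer built from two identical
microlens arrays separated by one lenslet focal length. The lenslet pitch and
focal length below are hypothetical values chosen only for illustration. The
150 mm focal length is the nominal value suggested by the part name AC508-150.
The function names are assumptions as well.

```python
def flat_top_size_mm(lenslet_pitch_mm, lenslet_focal_mm, doublet_focal_mm):
    """Paraxial flat-top width: pitch * f_doublet / f_lenslet."""
    return lenslet_pitch_mm * doublet_focal_mm / lenslet_focal_mm

def max_field_half_angle_rad(lenslet_pitch_mm, lenslet_focal_mm):
    """Largest field angle (paraxial) whose focal spot stays inside its own
    lenslet of the second array; beyond it, light spills into neighbouring channels."""
    return lenslet_pitch_mm / (2.0 * lenslet_focal_mm)

# Hypothetical lenslets: 0.5 mm pitch, 5 mm focal length.
width = flat_top_size_mm(lenslet_pitch_mm=0.5, lenslet_focal_mm=5.0, doublet_focal_mm=150.0)
half_angle = max_field_half_angle_rad(lenslet_pitch_mm=0.5, lenslet_focal_mm=5.0)
```

The second relation shows why the condenser has to be matched to the light
guide: the facet radius divided by the condenser focal length sets the field
angle, which must stay below this limit to avoid crosstalk between lenslets. A
rectangular field additionally requires different lenslet pitches in the two
lateral directions.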


3.2.2 Simulation and Construction 


Figure 3.23: Simulation of the illumination arm (labeled components: DMD, tube
lens, chromatic objective).


The microscope system is fully simulated and analyzed with OpticStudio in the
sequential mode. The wavelength range of the system is specified to be from
480 nm to 680 nm and a black-box model of the chromatic objective is adopted
from the manufacturer. As demonstrated by Figure 3.23, a group of three
achromatic doublets is chosen to form an infinity-corrected tube lens, which
matches the field of view (FoV) and aperture of the chromatic objective with
the effective area of the DMD as well as the illumination NA.


Figure 3.24: Axial chromatic shift in object space.


Due to the intrinsic design of the chromatic objective, the axial focus shift
is slightly non-linear (Figure 3.24). Additionally, the field curvature is
slightly increased by the introduction of the simple tube lens. Nevertheless,
both factors can be easily corrected through experimental calibration and
postprocessing. Apart from these aberrations, the designed combination of the
tube lens and the chromatic objective generates diffraction-limited
illumination spots in the object space for all wavelengths at their
corresponding focal planes. The imaging arm is the same as the illumination
arm, with an identical tube lens placed before the camera, and the camera
sensor is located at the conjugate location of the DMD. When a mirror is placed
at the illumination focal plane of each wavelength, diffraction-limited focus
points are achieved at the camera plane. For wavelengths of 480 nm, 580 nm and
680 nm, the Airy disk has a diameter of 4.5 µm, 5.4 µm and 6.4 µm respectively,
which matches the pixel size of 6.5 µm of the camera sensor. The
three-dimensional measurement volume is roughly 5.4 mm (x) by 3 mm (y) by
4.67 mm (z) for the specified wavelength range.
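

The quoted Airy disk diameters can be reproduced with the textbook relation
d = 1.22 λ / NA. The image-side NA of 0.13 used below is not stated in the
text. It is an assumption back-computed from the three quoted diameters, so the
sketch only checks that the numbers are mutually consistent.

```python
def airy_disk_diameter_um(wavelength_nm, numerical_aperture):
    """Diameter of the first dark ring of the Airy pattern: 1.22 * lambda / NA."""
    return 1.22 * wavelength_nm * 1e-3 / numerical_aperture

IMAGE_SIDE_NA = 0.13  # assumption, inferred from the quoted diameters
PIXEL_SIZE_UM = 6.5   # camera pixel size quoted in the text

diameters = {wl: airy_disk_diameter_um(wl, IMAGE_SIDE_NA) for wl in (480, 580, 680)}
for wl, d in diameters.items():
    print(f"{wl} nm: {d:.1f} um ({d / PIXEL_SIZE_UM:.2f} pixel widths)")
```

With this NA the three diameters round to 4.5 µm, 5.4 µm and 6.4 µm, i.e., each
diffraction-limited spot stays just within one camera pixel.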


Figure 3.25 shows the constructed programmable array microscope setup. The
system is aligned through the following steps. Firstly, the axial position of
the camera is fixed so that perfect focus is achieved through the tube lens
when a collimated light source is used from the direction of the beam trap. The
lateral position of the camera is defined by the mechanical tube connection
between the camera and the beamsplitter assembly. Secondly, the axial positions
of the DMD and a patterned reflective sample are adjusted iteratively until the
intrinsic pattern of the sample and the DMD pattern are both in focus for
illumination with a single wavelength. The lateral position of the DMD is
adjusted at the same time so that it remains as centered in the camera image as
possible. Thirdly, the angle of the DMD is adjusted so that perfect focus is
achieved across the complete field. Last but not least, the illumination angle
of the beam homogenization setup is adjusted to meet two conditions. On the one
hand, the captured intensity on the camera should be as high as possible for
better SNR. On the other hand, the blurring of a single focal spot is checked
by shifting a reflective sample axially, so that the blurred spot is as
balanced and homogeneous as possible. In practice, multiple iterations through
these steps are performed to achieve an optimum alignment.


3.2.3 Camera Calibration 


One key aspect which separates the area confocal scanning system from a
conventional single point system is its requirement of calibration between the
camera sensor and the light source. Considering the telecentric design of the
system, the camera is calibrated for a single wavelength of 555 nm by placing a
mirror at its focal plane. The registration is made only between the camera
coordinates and the DMD coordinates, while the registration toward the
object/world coordinate system is not considered. Therefore, all measurement
results can be laterally presented in the DMD coordinate system by first
projecting the DMD coordinate system onto the camera frame and then performing
an interpolation, as demonstrated in Figure 3.26. More details regarding the
calibration procedure can be found in the appendix.


Figure 3.25: Setup of the programmable array microscope with chromatic
encoding.


Figure 3.26: Camera calibration for AdaScope. Left: camera coordinate system.
Right: spatial DMD coordinate system.
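

The two steps named above (project the DMD grid into the camera frame, then
interpolate) can be sketched as follows. The affine registration model, its
coefficients and the function names are assumptions made for illustration. The
actual calibration procedure is the one described in the appendix.

```python
def dmd_to_camera(u, v, affine):
    """Project a DMD coordinate (u, v) into the camera frame with a 2x3 affine map."""
    (a, b, tx), (c, d, ty) = affine
    return a * u + b * v + tx, c * u + d * v + ty

def bilinear(image, x, y):
    """Bilinearly interpolate image[row][col] at the sub-pixel position (x=col, y=row)."""
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, len(image[0]) - 1), min(y0 + 1, len(image) - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom

def resample_to_dmd(image, affine, dmd_shape):
    """Sample the camera image at the projected position of every DMD pixel."""
    rows, cols = dmd_shape
    return [[bilinear(image, *dmd_to_camera(u, v, affine)) for u in range(cols)]
            for v in range(rows)]

# Hypothetical registration: camera sampling 1.5x finer than the DMD, offset (0.5, 0.25) px.
affine = ((1.5, 0.0, 0.5), (0.0, 1.5, 0.25))
camera_image = [[float(10 * r + c) for c in range(8)] for r in range(8)]
dmd_view = resample_to_dmd(camera_image, affine, dmd_shape=(4, 4))
```

Because the registration is fixed once calibrated, the projected coordinates
can be precomputed and reused for every camera frame.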


With a broadband mirror as the target, the spectral sensitivity of the imaging
system is also calibrated and incorporated into all measurement methods.
Variations in the spectral sensitivity mainly originate from the spectral
response of the pellicle beamsplitter as well as the camera sensor.


3.2.4 Illumination Generation 


The adaptive nature of the AdaScope system mainly originates from its ability
to generate arbitrary 3D illumination fields. The temporal spectral flux density
over the spatial DMD can be calculated as


$$E_D(x_D, y_D, \lambda, t) = \frac{\Phi_\lambda(\lambda, t)}{S_I}, \tag{3.11}$$


where $S_I$ represents the area of the homogenized illumination from the
programmable light source. The temporal illumination intensity distribution U
can be computed through the 2D convolution between the temporal spectral
flux density and a normalized chromatic intensity point spread function (PSF)
$H_I(x, y, z, \lambda)$. The lateral coordinates of the temporal spectral flux density have
to be scaled by the paraxial magnification M4. 


A constant factor represented by c is utilized to match the scaling and the
unit. For a relatively sparse illumination pattern, crosstalk between adjacent
illumination positions can be ignored and the intensity illumination function
can be approximated by


$$H_I(x, y, z, \lambda) = \delta(x)\,\delta(y)\,\delta\big(\lambda - g(z)\big), \tag{3.13}$$


where g(z) represents the axial chromatic shift.


With this approximation, the illumination intensity distribution can be inte- 
grated over time and wavelength: 
The target of the illumination generation process is to find the combination
of r, and m which could generate the desired U with a minimum exposure 
time T, while still maintaining all physical constraints. Similar to the process 
demonstrated in Equations 3.6 and 3.7, Equation 3.14 can also be discretized
into a matrix form:


$$U = E_z R_\lambda R_{xy}^T, \tag{3.15}$$


where the matrix $U \in \mathbb{R}^{n_z \times n_{xy}}$ contains the 3D
illumination intensities, which are reshaped into a 2D matrix. One axis denotes
the $n_z$ axial positions while the other axis represents the $n_{xy}$ 2D
lateral indices. The matrix $E_z \in \mathbb{R}^{n_z \times n_\lambda}$ is
generated based on $E_\lambda$ through mapping between the wavelength and the
axial position. The matrix $R_\lambda \in \mathbb{R}^{n_\lambda \times n_t}$
contains $n_t$ reflectance configurations of the $n_\lambda$ spectral DMD
pixels, and the matrix $R_{xy} \in \mathbb{R}^{n_{xy} \times n_t}$ contains
$n_t$ reflectance configurations of the $n_{xy}$ spatial DMD pixels. The
pseudo-inverse of $E_z$ can be multiplied to both sides of the equation:


$$E_z^T \left(E_z E_z^T\right)^{-1} U = R_\lambda R_{xy}^T. \tag{3.16}$$


The first step of finding the optimum configuration of the two DMDs would be to
obtain a nonnegative full-rank decomposition of the left side (which is not
guaranteed to be nonnegative itself). This is not a simple task, as the problem
of nonnegative matrix factorization has been proven to be NP-hard [Vav10].
Additionally, the discretization of time steps is yet to be considered. Due to
the bit-plane mixing capability of the DMD controller, the efficiency of
multiple binary patterns is lower than that of, e.g., 8-bit patterns.
Meanwhile, Equation 3.15 is equivalent to Equation 3.14 only when the
discretization is based on binary patterns, i.e., at least one of the matrices
$R_\lambda$ and $R_{xy}$ has to be a binary matrix. In practice, for most of
the simple illumination distributions, either $R_\lambda$ or $R_{xy}$ is
determined first in an empirical manner, while the other matrix is computed
afterwards. Such a process is sufficient for the methods proposed in this
thesis.
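

As a toy illustration of the matrix form in Equation 3.15, the sketch below
evaluates the forward model for an empirically chosen binary spatial matrix. It
reads the equation as U = E_z R_λ R_xyᵀ, where the symbol names, the tiny
matrix sizes and all numerical values are assumptions made for illustration.
No factorization is attempted.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*m)]

# Toy sizes: 3 axial positions, 3 spectral DMD pixels, 4 lateral positions, 2 time steps.
E_z = [[1.0, 0.2, 0.0],   # axial response of each spectral DMD pixel (one column each)
       [0.2, 1.0, 0.2],
       [0.0, 0.2, 1.0]]
R_lambda = [[1.0, 0.0],   # column t: spectral reflectances used in time step t
            [0.0, 0.0],
            [0.0, 0.5]]
R_xy = [[1, 0],           # column t: binary spatial pattern used in time step t
        [1, 0],
        [0, 1],
        [0, 1]]

U = matmul(matmul(E_z, R_lambda), transpose(R_xy))  # shape n_z x n_xy
```

In this example the first two lateral positions are illuminated at the first
axial position during time step 0, and the last two at the third axial position
with half the weight during time step 1, all within one exposure.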


3.2.5 Synchronization Mechanism 


The Zyla 5.5 sCMOS camera in the system has a very good signal-to-noise ratio
(1.2 e⁻ read noise) and a high dynamic range (25000:1), but its speed (maximum
40 fps) is not particularly fast compared to some other high-speed camera
models. To fully utilize its potential, the camera (controlled by the host
computer) acts as the master in the synchronization mechanism and triggers the
spectral DMD in the programmable light source, which then triggers the spatial
DMD in the area scanning microscope, as shown in Figure 3.27.


Figure 3.27: Triggering mode #1. The camera triggers the spectral DMD to start
a series of illumination spectra. Each spectrum triggers a corresponding
pattern on the spatial DMD.


Both DMDs support a binary pattern rate of 9.5 kHz, but the transmission of the
patterns turns out to be the bottleneck of the pattern generation speed.
Although the HDMI interface on the control board offers a wider transmission
bandwidth, accurate synchronization of the two DMDs becomes more difficult when
both are connected through HDMI. Therefore, the on-board USB 1.1 interface is
utilized in the pattern-on-the-fly mode, which allows for accurate triggering
between the various components.
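

A rough timing budget follows directly from the two rates quoted above. The
sketch gives only an upper bound: it ignores camera readout, trigger latency
and the pattern transmission bottleneck mentioned in the text, and the function
name is hypothetical.

```python
import math

def patterns_per_frame(pattern_rate_hz, frame_rate_hz):
    """Upper bound on binary patterns that fit into one camera frame period."""
    return math.floor(pattern_rate_hz / frame_rate_hz)

BINARY_PATTERN_RATE_HZ = 9500.0  # DMD binary pattern rate quoted in the text
MAX_FRAME_RATE_HZ = 40.0         # maximum camera frame rate quoted in the text

frame_period_ms = 1000.0 / MAX_FRAME_RATE_HZ
budget = patterns_per_frame(BINARY_PATTERN_RATE_HZ, MAX_FRAME_RATE_HZ)
```

At the maximum frame rate, one frame lasts 25 ms and can contain at most 237
binary patterns, which bounds how many spectral and spatial configurations can
be time-multiplexed into a single exposure.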


Figure 3.28: Triggering mode #2. The spectral DMD triggers the camera to
capture a series of illumination spectra. The spatial DMD is operated directly
through the host PC.


In certain situations, synchronized operation of the spatial DMD is not
necessary, so only the synchronization between the camera and the spectral DMD
is maintained while the spatial DMD is directly controlled by the host
computer, as illustrated in Figure 3.28. In this case, it is particularly
useful to apply the spectral DMD in video mode as the master which triggers the
camera, so that a larger number of complex patterns can be displayed by the
spectral DMD without any interruption, which would have exceeded the internal
memory of the DMD if operated in pattern-on-the-fly mode.


3.3 Summary 


Based on a holistic design approach, the AdaScope system is composed of 
two parts. The programmable light source is developed based on the echel- 
logram of a supercontinuum laser source, which is spatially filtered by a DMD. 
Through electronic control of the DMD, arbitrary output spectra can be gen- 
erated in a time-multiplexed manner with high resolution and fast speed. 
The output light from the programmable light source is transmitted to a pro- 
grammable array microscope through a liquid light guide, which is then ho- 
mogenized and projected onto a secondary DMD to form an extended spatially 
programmable source. The microscope setup adopts a reflective configuration 
based on a beamsplitter and a chromatic microscopic objective is utilized for 
axial chromatic encoding. 


In conventional confocal scanning systems, the scanning volume, which is 
formed by the scanning range of the three axes, can be seen as a three dimen- 
sional grid of discrete scanning locations, which are limited by the resolution 
of the scanning system. Each time only a fixed single point or point array 
can be measured. However, in the AdaScope system, through synchroniza- 
tion between the camera, the spectral DMD and the spatial DMD, an arbi- 
trary combination of scanning locations within the scanning volume can be 
addressed within a single frame of exposure in a time multiplexed manner. 
Moreover, different scanning locations can be weighted in a single frame by 
manipulating the spectral intensity of the illumination light. As the key fea- 
ture of the AdaScope, such adaptability allows novel measurement methods 
to be developed while the AdaScope is operated in different modes. 


4 Cascade Measurement Strategy


Conventional 3D microscopic systems are dominated by the dilemma between
scanning density and measurement accuracy, which is qualitatively expressed in
Figure 4.1. The best accuracy and robustness are achieved when a single focal
point is scanned in a confocal system (area A in Figure 4.1). As the density of
simultaneously scanned locations increases, regardless of which measurement
method is adopted, the degree of crosstalk also increases, which degrades
measurement accuracy (area B in Figure 4.1). At a certain point, the confocal
condition is no longer maintained and the system is reduced to a wide-field
microscope (area C in Figure 4.1). Although methods based on shape from focus
can be adopted, the measurement accuracy is generally worse than that of the
equivalent confocal scanning system and is more sensitive to the underlying
texture of the sample surface, which further degrades the robustness of the
system.


Over the years, various measurement methods have been developed to push a
particular section of the boundary toward the upper right direction. Instead of
only relying on incremental improvements of any particular method, the AdaScope
system attempts to tackle this problem with a new kind of measurement strategy.
Thanks to the intrinsic adaptability granted by the design and construction of
the AdaScope, the system is capable of swiftly switching between different
operational modes, i.e., between areas A, B and C in Figure 4.1. For a complete
measurement task, a cascade of measurement methods can be developed and
applied, where a rough and fast measurement result in one stage is fed to the
next stage of measurement as prior knowledge for more accurate measurements, as
demonstrated in Figure 4.2. Such prior knowledge can be utilized by
facilitating the initialization of the new measurement or constraining the
measurement range. This strategy allows the advantages of each mode to be
maximized, achieving optimum efficiency of the hardware.


Figure 4.1: Dilemma of scanning density and measurement accuracy (axes:
accuracy / robustness versus density / speed). A: single point confocal
scanning. B: slit/array scanning. C: Shape from focus.


Figure 4.2: A new measurement strategy enabled by AdaScope (along the cascade,
speed goes from high to low and accuracy from low to high).


In the following sections, four different methods are investigated for the
cascade measurement strategy. Section 4.1 presents the compressive shape from
focus method, which is developed for the pre-measurement stage. The target is
to perform a fast measurement with a minimum number of camera frames. In
Section 4.2, one candidate for the main measurement, namely the iterative array
adaptation method, is introduced. Alternatively, the direct area confocal
scanning method is investigated in Section 4.3. For post-measurement
refinement, dynamic localized confocal scanning based on Bayesian experimental
design is discussed in Section 4.4.


4.1 Compressive Shape from Focus 


The fastest measurement mode of AdaScope is achieved when all lateral lo- 
cations are measured simultaneously, i.e., all pixels of the spatial DMD are 
turned on. In this case, no lateral scanning is required and the system be- 
haves similarly to a shape from focus setup, where the acquisition of the focal 
stack can be implemented through scanning of the illumination wavelength. 


The estimation accuracy of conventional shape from focus techniques is strongly
coupled with the number of images in the focal stack, which limits the
measurement speed. In this section, a novel compressive shape from focus method
is proposed with an exemplary algorithm based on the modified Laplacian
operator (LAPM) and principal component analysis (PCA). Simulations with
synthetic focal stacks have demonstrated results comparable to the conventional
method. A test with six compressively captured images achieves the same level
of performance as the conventional method with 100 images. Several other focus
measure algorithms are also implemented and tested under the compressive
scheme, which demonstrates the wide applicability of the proposed method.


4.1.1 Background 


The key concept of recovering depth information from a focal stack is the
relationship between focused and defocused images. In a thin lens model,
image points that are sharply projected on an image plane fulfill the Gaussian
lens equation (Figure 4.3):


$$\frac{1}{f} = \frac{1}{q} + \frac{1}{d_f}, \tag{4.1}$$


where q is the distance of the object point from the lens plane, $d_f$ denotes
the distance of the focused image from the lens plane and f represents the
focal length of the optical system. When the detector is moved away from the
focus position, the image of the object will be blurred. The degree of blurring
depends on how far away the object is from the in-focus position as well as
the characteristics of the imaging system.


Figure 4.3: Imaging of a thin lens.
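

The relationship between defocus and blur can be made concrete with a short
sketch based on the Gaussian lens equation and simple ray geometry. All
numerical values are hypothetical and the function names are chosen for
illustration only.

```python
def image_distance(object_distance, focal_length):
    """Solve the Gaussian lens equation 1/f = 1/q + 1/d for the image distance d."""
    if object_distance <= focal_length:
        raise ValueError("object must lie beyond the focal length for a real image")
    return 1.0 / (1.0 / focal_length - 1.0 / object_distance)

def blur_diameter(aperture_diameter, focused_distance, detector_distance):
    """Geometric blur-circle diameter when the detector is displaced from focus."""
    return aperture_diameter * abs(detector_distance - focused_distance) / focused_distance

# Hypothetical numbers in millimetres: f = 50, object at 150, aperture diameter 10.
d_focus = image_distance(object_distance=150.0, focal_length=50.0)
blur_in_focus = blur_diameter(10.0, d_focus, d_focus)
blur_defocused = blur_diameter(10.0, d_focus, d_focus + 1.5)
```

The blur diameter grows linearly with the detector displacement and with the
aperture diameter, which is why a larger aperture yields a shallower depth of
field.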


Figure 4.4: Sample images from a focal stack using an imaging system with a small depth of 
field. 


Utilizing the relationship between the blur and the distance to the focus, con- 
ventional shape from focus methods are composed of mainly three steps. 
Firstly, a stack of images are captured while the focus of the imaging system 
is shifted with respect to the object. This is typically implemented through 
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either mechanical scanning of the camera/sample, or motorized focus shift- 
ing with the lens. Secondly, a focus measure value is calculated for each pixel 
of every image in the stack to form a 3D focus measure cube, where two 
dimensions represent the transverse spatial coordinates corresponding to the 
camera pixels and one dimension denotes the axial shift coordinate. The focus 
measure value can be calculated with various algorithms to evaluate how well 
the underlying pixel is in focus. Thirdly, the depth information of each
pixel is retrieved based on its focus measure curve. Regardless of the focus
measure algorithm, the focus measure values of a specific pixel over the axial
locations within the measurement range typically form a Gaussian-shaped
signal, which is similar to the axial confocal signal illustrated in Figure 
A naive approach is to take the axial position with maximum focus measure 
value as the axial position of the object at this lateral location. More sophis- 
ticated approaches involve fitting of the focus measure curve as well as opti- 
mization techniques, such as total variation regularization, which have been 
discussed in Section =Æ! 
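The three steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the implementation used in this work: the function names, the toy defocus model (defocus merely attenuates the texture contrast) and all parameter values are assumptions made for the example, and the naive maximum search is used for depth retrieval.

```python
import numpy as np

def focus_measure_cube(stack):
    """Per-pixel focus measure for a focal stack of shape (n_z, height, width)."""
    stack = np.asarray(stack, dtype=float)
    fx = np.zeros_like(stack)
    fy = np.zeros_like(stack)
    # Second differences along x and y (border pixels are left at zero).
    fx[:, :, 1:-1] = -stack[:, :, :-2] + 2 * stack[:, :, 1:-1] - stack[:, :, 2:]
    fy[:, 1:-1, :] = -stack[:, :-2, :] + 2 * stack[:, 1:-1, :] - stack[:, 2:, :]
    return np.abs(fx) + np.abs(fy)

def depth_from_focus(stack, z_positions):
    """Naive estimate: axial position with the maximum focus measure per pixel."""
    cube = focus_measure_cube(stack)
    return np.asarray(z_positions)[np.argmax(cube, axis=0)]

# Toy focal stack (assumption): defocus only attenuates the texture contrast.
rng = np.random.default_rng(0)
z_positions = np.linspace(0.0, 1.0, 21)
texture = rng.random((32, 32))
# Left half of the object lies at depth 0.3, right half at depth 0.7.
true_depth = np.where(np.arange(32) < 16, 0.3, 0.7) * np.ones((32, 1))
contrast = np.exp(-(z_positions[:, None, None] - true_depth) ** 2 / (2 * 0.1 ** 2))
stack = 0.5 + contrast * (texture - 0.5)

estimated_depth = depth_from_focus(stack, z_positions)
```

In this toy setting the estimate reproduces the two depth levels away from the step edge. With a realistic defocus model the focus measure curve is noisier, which is why curve fitting or regularization is usually preferred over the plain maximum.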


According to estimation theory, the measurement uncertainty is limited by 
the Cramér-Rao lower bound [van07], which is closely related to the gradient
of the signal. For a Gaussian-shaped signal, the maximum gradient is
determined by the width of the peak. Since a narrower peak leads to more 
accurate measurement, shape from focus systems typically aim to achieve a 
depth of field as small as possible by using optical systems with a larger
aperture. Nevertheless, a narrower depth of field requires the focal stack
to be acquired more densely, which can be very time-consuming.
To solve this problem, a method for compressive acquisition of
the focal stack is developed. 
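The link between peak width and maximum gradient can be made explicit for an idealized Gaussian focus measure curve. The symbols $A$ (peak amplitude), $z_0$ (peak position) and $\sigma$ (peak width) are introduced here only for illustration and are not part of the original notation.

```latex
g(z) = A \exp\!\left(-\frac{(z - z_0)^2}{2\sigma^2}\right),
\qquad
g'(z) = -\frac{A\,(z - z_0)}{\sigma^2}
        \exp\!\left(-\frac{(z - z_0)^2}{2\sigma^2}\right),
\qquad
\max_z \left| g'(z) \right| = \left| g'(z_0 \pm \sigma) \right|
  = \frac{A}{\sigma \sqrt{e}} .
```

Under this assumption the steepest slope is inversely proportional to $\sigma$, so halving the peak width doubles the maximum gradient at constant amplitude.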


4.1.2 Linear Measurement Model 


Various real-world signals can be viewed as an $n$-dimensional vector $x \in \mathbb{R}^{n}$,
including sound, image, etc. In a linear measurement model, each measure- 
ment of the target signal is a linear combination of all values in the vector x. 




The complete set of measurements of the signal can be written as an $n_y$-dimensional
vector $y = Ax \in \mathbb{R}^{n_y}$ with a measurement matrix $A \in \mathbb{R}^{n_y \times n}$.

The ultimate goal of the linear measurement model, like any other measure- 
ment system, is to retrieve the signal x and the information it is carrying. For- 
mulation of the linear measurement model as a linear system naturally leads 
to a classical problem of linear algebra: conditions for solving the equation 
y = Ax. In this context, this question is equivalent to what kind of measure- 
ments are needed in order to recover the signal. 


Although classical linear algebra rules this out in general, recent developments
in compressive sensing have shown that an underdetermined linear
system can be uniquely solved provided sufficient prior knowledge [Canog]. 
In the case of compressive sensing, such prior knowledge refers to the as- 
sumption of sparsity. However, this is not the only possible prior knowledge. 
From a more general perspective, the underdetermined linear system with 
prior information represents a linear manifold learning problem where the 
prior information acts as the boundary of the manifold to be learned by its 
low-dimensional projection. The fundamental philosophy behind solutions 
of such problems is that the information embedded inside the high dimen- 
sional manifold is intrinsically of low dimension. In the case of compressive 
sensing, the unknown manifold is limited to hyperplanes spanned by a limited 
number of axes which corresponds to the sparsity assumption. 


The significance of the linear measurement model to conventional SFF ap- 
proaches is that the number of images required in the focal stack can be ef- 
fectively compressed if each image can act as a linear combination of all orig- 
inally required images in the focal stack. The focus measure stack can then 
be retrieved from the focus measure values calculated from the compressed 
images. It should be noted that this is only possible when the focus measure
operator is linear, which is rarely true for modern focus measure operators.
Fortunately, most focus measure operators are composed of several
sub-operators, and as long as at least one linear sub-operator precedes all
nonlinear sub-operators, the reconstruction can be inserted between them. In
other words, the first sub-operator applied to the compressed images must be linear.
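The linearity requirement can be checked numerically. The sketch below uses an arbitrary random measurement matrix and a 1D second-difference filter as a stand-in for a linear sub-operator; all sizes and names are illustrative assumptions. Filtering and compression commute, whereas taking the absolute value does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n_z, n_c, width = 20, 4, 64            # focal positions, compressed images, pixels per row
stack = rng.random((n_z, width))       # one image row per focal position
A = rng.standard_normal((n_c, n_z))    # arbitrary measurement matrix (illustrative)

def laplace_1d(rows):
    """Linear sub-operator: second difference along the pixel axis."""
    out = np.zeros_like(rows)
    out[:, 1:-1] = -rows[:, :-2] + 2 * rows[:, 1:-1] - rows[:, 2:]
    return out

compressed = A @ stack                 # each row is a linear combination of all focal planes

# Linear sub-operator: the order of filtering and compression does not matter.
linear_commutes = np.allclose(A @ laplace_1d(stack), laplace_1d(compressed))

# Non-linear sub-operator (absolute value): the order does matter.
abs_commutes = np.allclose(A @ np.abs(laplace_1d(stack)),
                           np.abs(laplace_1d(compressed)))
```

This is why the compression and recovery steps have to sit in front of the first non-linear operation of the focus measure.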




Figure 4.5: The compression and recovery steps must be added before all non-linear operators 
due to their linear nature. Blue: linear operators. Red: non-linear operators. 


The algorithm for the reconstruction of the focus measure stack depends on 
the prior knowledge, i.e. the focus measure operator. On one hand, when 
the focus measure curve has a defined peak, recent compressive sensing algorithms
can be incorporated for the recovery of the whole curve. On the other
hand, if a training process is allowed or assumptions regarding the focus mea- 
sure curves can be made, conventional methods like PCA can be applied in this 
scheme to yield the measurement/compressing matrix and the corresponding 
reconstruction matrix. 


4.1.3 Compressive Algorithm 


To explain the idea in a concrete and clear manner, an exemplary algorithm 
is presented in this section. The schematic of the algorithm is illustrated in
Figure 4.6.


The measurement matrix forming the compressed images and the reconstruc- 
tion matrix for decompression are designed by a training process. In this pro- 
cess, conventional SFF procedures are implemented on a sample focal stack so 
that the focus measure curve for each pixel is calculated. All the focus measure
curves are then assembled and PCA is conducted on them. The principal components
with the largest variance are combined to construct the measurement matrix for
the compressed images, and the reconstruction matrix is simply the transpose of
the measurement matrix. The focus measure curve can be reconstructed by multiplying
the focus measure values of the compressed images with the reconstruction 
matrix: 

$$x_r = A^{\mathsf{T}} y = A^{\mathsf{T}} A x_o, \qquad (4.2)$$


where $x_o$ is the original focus measure curve and $x_r$ is its reconstruction.


Figure 4.6: Schematic of compressive SFF with the LAPM operator.
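The training step and the reconstruction by the transposed matrix can be illustrated as follows. This is a sketch under stated assumptions rather than the original code: PCA is realized as an uncentered singular value decomposition, the training curves are idealized Gaussians with hypothetical width and positions, and the function name `train_measurement_matrix` is made up for the example.

```python
import numpy as np

def train_measurement_matrix(curves, n_components):
    """Return A with shape (n_components, n_z) from curves of shape (n_samples, n_z).

    Uncentered PCA via SVD (assumption): the rows of vt are orthonormal
    principal directions ordered by decreasing singular value.
    """
    _, _, vt = np.linalg.svd(curves, full_matrices=False)
    return vt[:n_components]

# Idealized training set: Gaussian-shaped curves with random peak positions.
z = np.linspace(0.0, 1.0, 61)
rng = np.random.default_rng(1)
centers = rng.uniform(0.2, 0.8, size=500)
sigma = 0.15
curves = np.exp(-(z[None, :] - centers[:, None]) ** 2 / (2 * sigma ** 2))

A = train_measurement_matrix(curves, n_components=6)

# Compress an unseen curve into 6 values and reconstruct it with the transpose.
x_o = np.exp(-(z - 0.47) ** 2 / (2 * sigma ** 2))
y = A @ x_o
x_r = A.T @ y
rel_error = np.linalg.norm(x_r - x_o) / np.linalg.norm(x_o)
```

Because the rows of `A` are orthonormal, `A.T @ A` is the projector onto the learned subspace, so the reconstruction is exact for any curve lying in that subspace and approximate otherwise.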


The widely accepted modified Laplacian operator (LAPM) is selected for the 
calculation of the focus measure [Nay94]. It consists of two sub-operators. 
Firstly, a one-dimensional Laplacian filter is constructed as $f_{\mathrm{lap}} = (-1, 2, -1)^{\mathsf{T}}$
and used to filter the image in the X and Y directions respectively. Secondly,
the absolute values of the two filtered images are summed as the final focus
measure value. The 1D filtering operation, being a convolution, is linear,
while taking the absolute value is non-linear. Therefore the training and
reconstruction step must be inserted before taking the absolute value. From 
the recovered datacubes of the filtering results in X and Y directions, the final 
focus measure value can be computed through the sum of the two absolute 
values. Then for each pixel, the axial focus measure curve is smoothed before 
the maximum value is located to estimate the axial depth. 
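A compact simulation of this procedure is sketched below. It is an illustration under several assumptions, not the original Matlab implementation: the defocus model only attenuates texture contrast, PCA is an uncentered SVD, the compressed images are formed numerically (including negative weights, which a physical system would have to realize differently), and the axial smoothing step is omitted. All function names are hypothetical.

```python
import numpy as np

def lap_x(s):
    out = np.zeros_like(s)
    out[..., 1:-1] = -s[..., :-2] + 2 * s[..., 1:-1] - s[..., 2:]
    return out

def lap_y(s):
    out = np.zeros_like(s)
    out[..., 1:-1, :] = -s[..., :-2, :] + 2 * s[..., 1:-1, :] - s[..., 2:, :]
    return out

def synth_stack(texture, depth, z, sigma=0.15):
    # Toy defocus model (assumption): contrast decays with distance to focus.
    contrast = np.exp(-(z[:, None, None] - depth) ** 2 / (2 * sigma ** 2))
    return 0.5 + contrast * (texture - 0.5)

def train(stack, n_components):
    # Training curves are the signed filter outputs, i.e. taken before the absolute value.
    n_z = len(stack)
    curves = np.concatenate([lap_x(stack).reshape(n_z, -1),
                             lap_y(stack).reshape(n_z, -1)], axis=1).T
    _, _, vt = np.linalg.svd(curves, full_matrices=False)
    return vt[:n_components]                     # measurement matrix A, shape (n_c, n_z)

def csff_depth(compressed, A, z):
    # Linear sub-operator on the compressed images, then reconstruction with A^T.
    rx = np.tensordot(A.T, lap_x(compressed), axes=1)   # shape (n_z, H, W)
    ry = np.tensordot(A.T, lap_y(compressed), axes=1)
    focus = np.abs(rx) + np.abs(ry)                     # non-linear sub-operator
    return z[np.argmax(focus, axis=0)]

rng = np.random.default_rng(0)
z = np.linspace(0.0, 1.0, 61)
ramp = np.linspace(0.2, 0.8, 32) * np.ones((32, 1))     # training depthmap: linear ramp
A = train(synth_stack(rng.random((32, 32)), ramp, z), n_components=6)

test_depth = np.full((32, 32), 0.5)                      # test object: flat surface
test_stack = synth_stack(rng.random((32, 32)), test_depth, z)
compressed = np.tensordot(A, test_stack, axes=1)         # 6 images instead of 61
depth_est = csff_depth(compressed, A, z)
median_error = np.median(np.abs(depth_est - test_depth))
```

The constant image offset is removed by the Laplacian filter, so only the depth-dependent contrast term has to be represented by the six principal components.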


4.1.4 Simulation and Discussion 
4.1.4.1 Dataset Construction 


To demonstrate the applicability of the proposed algorithm, simulation is im- 
plemented in Matlab with a series of datasets synthetically generated through 
programs provided by Pertuz et al. in their survey study [Per13]. The generation
of the focal stacks is based on a non-linear shift-variant model of defocus.
All estimation results shown in this section are smoothed with a mean filter 
(window size = 5). 


To investigate the influence of the training dataset on the compressive
SFF result, two texture maps and two depthmaps are combined to form four
different datasets. Texture #1 is a structured concentric pattern while texture
#2 is a random pattern. Depthmap #1 is a linear ramp and depthmap #2 is part
of a sphere. The four datasets #1-#4 are formed by combining textures
and depthmaps in the following order: #1 and #1, #1 and #2, #2 and #1, #2
and #2.


Figure 4.7: Textures and depthmaps used to synthesize focal stacks.


4.1.4.2 Simulation Result 


Table 4.1: RMS error showing influence of training set on testing result. 


RMS Error (×10⁻³)    | Set #1   Set #2   Set #3   Set #4
---------------------+-----------------------------------
No Training          | 1.69     1.74     4.92     3.25
Training Set #1      | N/A      0.62     1.19     0.79
Training Set #2      | 1.11     N/A      2.98     0.72
Training Set #3      | 4.23     2.40     N/A      0.46
Training Set #4      | 6.95     2.52     5.58     N/A
Training Set #1-#4   | 3.65     1.93     1.32     0.54


Results of CSFF are compared with those of conventional SFF using the root- 
mean-square (RMS) error with respect to the ground-truth depthmaps, which
are listed in Table 4.1. For the conventional method, a focal stack of 61 images
is generated in each case, and for the CSFF method the 61 images are
compressed into 6 images. The row labeled "No Training" represents the conventional
case whereas the other rows are labeled with their corresponding
training set, which is used to generate the measurement matrix and the
reconstruction matrix. It can be seen from Table 4.1 that the choice of the training
set has an influence on the testing result. In general, the compressive results
are comparable to the conventional results but require far fewer
compressively captured images. The test result of set #1 with training set
#2 and the test result of set #4 with training set #1 are illustrated in Figure 4.8.
Several cases in Table 4.1 show that the CSFF method achieves even lower
error than the SFF method. This is partly due to the fact that the compression 
and reconstruction process effectively applies smoothing to the focus measure 
curve, which is typically quite noisy in the conventional SFF method. 


To investigate the number of images needed for SFF, a series of focal
stacks with different numbers of images is synthesized based on texture #1 and 
depthmap #1 (same combination as dataset #1 used in previous simulations). 
As expected for the conventional SFF scheme, when the number of images 
increases, the RMS error decreases, indicating a better estimation result. This
is due to the fact that the simulated imaging system for image synthesizing 
has a limited depth of field defined by the blurring kernel. When the step 
between two adjacent images is too large, the areas with depth in the interval 
between two focal planes will never get the chance to be imaged sharply and 
thus cannot be estimated robustly. With the conventional SFF scheme, the 
minimum number of images needed for robust estimation depends largely on 
the depth of field of the imaging system, which determines the width of the 
peak in the focus measure curve when using an operator like LAPM. Generally 
speaking, when the step size is larger than the width of the focus measure 
curve, measurement artifacts will start to appear in the estimation result. The 
dependency of the estimation accuracy on the number of images in the focal 
stack is illustrated by the blue curve in Figure 4.9. With the current optical
configuration, the conventional SFF method reaches the performance limit at 
approximately 85 images. 


Figure 4.8: Depth estimation results with conventional SFF and compressive SFF. The left column
illustrates the result of test set #1 with training set #2 and the right column illustrates
the result of test set #4 with training set #1.


Figure 4.9: Dependency of the estimation accuracy on the number of input images. The error is
plotted in logarithmic scale.


In contrast, in CSFF, regardless of the number of compressive images
to be captured, each image acts as a linear combination of all focal planes
within the measurement range, and thus contains information from all focal
positions in an encoded manner. A training dataset based on texture #1
and depthmap #2 is synthesized with 100 images. The number of compressive 
images is solely determined by the number of largest principal components 
to be selected for the construction of the measurement matrix. As shown in 
Figure 4.9, CSFF allows far fewer images to be captured to achieve the same
level of estimation accuracy as the conventional method. With
6 images, the estimation performance reaches its limit, which is comparable
to the performance of the conventional SFF method with 70 images. As the
number of channels increases further, the performance drops due to
overfitting. With more than 16 principal components, the measurement matrix
is increasingly adapted to the training dataset, resulting in a rise of the
RMS error for the testing dataset. It should be noted that results presented 
above demonstrate the feasibility of the method only on the theoretical level 
with synthetic datasets. In practice, the performance of both methods could 
be degraded by various sources of noise. 


As the information contained in the largest principal components is related
to the rank of the matrix, it is preferred to have a matrix with a relatively
low rank so that more information is contained within a smaller number of
principal components. This means that the width of the focus measure curve
should be larger, i.e., the imaging system should have a larger depth of field.
However, as the width gets larger, the relative magnitude of the focus measure
peak gets smaller, effectively reducing the SNR of the measurement. Therefore,
a balance must be made between these two factors to achieve the best
estimation performance. 
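The relation between peak width and effective rank can be illustrated with idealized Gaussian focus measure curves. The sketch below is an assumption-laden toy example (curve shape, widths and sample counts are chosen freely) that measures how much of the total energy of a set of curves is captured by the six largest principal components.

```python
import numpy as np

def energy_in_leading_components(sigma, n_components=6, n_z=100, n_curves=400, seed=0):
    """Fraction of the total energy captured by the leading principal components."""
    rng = np.random.default_rng(seed)
    z = np.linspace(0.0, 1.0, n_z)
    centers = rng.uniform(0.0, 1.0, n_curves)          # random peak positions
    curves = np.exp(-(z[None, :] - centers[:, None]) ** 2 / (2 * sigma ** 2))
    s = np.linalg.svd(curves, compute_uv=False)         # singular values, descending
    return np.sum(s[:n_components] ** 2) / np.sum(s ** 2)

narrow = energy_in_leading_components(sigma=0.02)   # small depth of field
wide = energy_in_leading_components(sigma=0.15)     # large depth of field
```

In this toy model the wide peaks are represented almost completely by six components, while the narrow peaks are not, which mirrors the rank argument above. The accompanying loss of SNR for wide peaks is not modeled here.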


4.1.4.3 Applicability 


To investigate the applicability of the compressive approach, several other 
focus measure algorithms are implemented and simulated with the synthetic
datasets, including the diagonal Laplacian operator (LAPD) [The09],
the Tenengrad algorithm (TENG) and the steerable filters algorithm
(SFIL) [Min09]. Similar to the LAPM, the LAPD also applies Laplacian
operators to the captured images, but in two additional diagonal directions. 
Based on the gradients of the image, the Tenengrad method applies Sobel
filters to the image in both directions and then calculates a squared sum. 
The steerable filters algorithm is a sophisticated modern algorithm that has 
attracted quite a lot of attention. The focus measure value is calculated using 
steerable filters in several directions, which are designed in quadrature pairs 
for better control over phase and orientation. The maximum of the filtered 
results is taken as the focus measure. Mathematical details regarding these 
algorithms can be found in the respective literature. 


Two groups of tests are conducted using the compressive approach proposed 
previously with all four focus measure algorithms. For group #1, dataset #1 
is tested using the training result from dataset #2. For group #2, dataset #4 is 
tested using the training result from dataset #1. All algorithms are modified 
so that the compression and reconstruction processes are inserted before any 
non-linear operations. As shown in Figure 4.10, all four focus measure algorithms
provide similar results under the compressive scheme. Therefore,
the compressive scheme is in general not very sensitive to the selection of
the focus measure algorithm as long as the linearity condition mentioned in
Section 4.1.2 is satisfied. Nevertheless, for both groups of tests, the steerable
filters algorithm has demonstrated noticeably worse results than the other
three algorithms. As a more sophisticated algorithm, SFIL should in principle
generate better results than the other three algorithms when applied in
conventional SFF. However, due to its higher complexity, the added noise
introduced by the compression and reconstruction might have a more severe
influence on the final focus measure calculation, leading to a worse overall
result. This implies that the degradation caused by the compression is possibly
more severe for more complicated focus measure algorithms, which has
to be taken into consideration when applying compressive shape from focus
in practice.


Figure 4.10: Comparison of CSFF using different focus measure algorithms. Group #1: test set
#1 with training set #2. Group #2: test set #4 with training set #1.


4.1.5 Summary 


In this section, a novel method of compressive shape from focus is presented
and simulated. Based on the linear measurement model, the CSFF method
compressively captures several images, each as a linear combination of all
possible focal planes within the measurement range. It has been shown in
the simulation that the estimation error of CSFF is comparable to that of the
conventional method using as many images as the training set for CSFF contains.
With the synthesized datasets, CSFF with 6 compressive
images yields comparable performance to the conventional method
with a focal stack of 70 images. Apart from LAPM, several other focus mea- 
sure algorithms are also tested under the compressive approach, indicating 
wide applicability of the method. 


4.2 Iterative Array Adaptation for 3D Confocal Scanning


As shown by the simulation results in Section 4.1, the compressive shape
from focus method is able to retrieve the 3D surface profile with a minimum 
number of images. In practice, the focus measure operators are generally 
very sensitive to camera noise and the choice of an optimum focus operator 
depends highly on the surface texture, degrading the robustness of the method 
in real applications. Nevertheless, measurement accuracy for smooth surfaces 
is often sufficient to significantly restrict the axial measurement range, so that 
a more accurate measurement method can be initiated. 


In this section, an iterative array adaptation method is proposed for confocal
3D scanning. Unlike conventional array scanning methods where a fixed
pitch distance is specified, the pitch distance and the axial measurement range 
are collectively adapted in an iterative manner in order to achieve a higher 
scanning efficiency. 


4.2.1 Motivation and Concept 


The idea originates from the observation that the uncertainty of the chromatic 
confocal measurement is in fact coupled with the lateral density of measurement
locations, as shown in Figure 4.11. When little information about the measurement
locations has been gathered, the crosstalk could potentially be very large
and therefore a larger distance between adjacent points is required. As the
measurements at each point become more and more accurate, the possibly 
generated crosstalk also gets smaller which allows for a denser measurement 
array. 


Figure 4.11: Coupling of axial measurement uncertainty and lateral measurement density.


Based on this observation, the measurement is conducted in several iterations. 
In each iteration, measurements with limited accuracy are made for all posi- 
tions through array scanning with a fixed pitch distance. Based on the result 
from one iteration, more refined measurements are made with a denser grid 
in the next iteration. 


4.2.2 Axial Measurement Refinement 


In each iteration, a two-channel linear measurement system is applied to a 
scanning array of measurement locations. The two measurement filter func- 
tions are two ramp-shaped functions in opposite directions. To measure the 
axial location of the corresponding chromatic confocal peak, illuminations 
with spectra in the shape of the measurement functions are applied and the 
corresponding images are captured. As Bernstein polynomials of degree 1, 
these functions have the nice property that the corresponding linear transfor- 
mation maintains the centroid of the original signal (proof in Appendix A.2). 
Therefore, the centroid of the chromatic confocal peak can be estimated with 


$$\mathrm{COG}(x) = \mathrm{COG}(y), \qquad (4.4)$$


where $x$ represents the original confocal signal, $y$ denotes the measurement
and $\mathrm{COG}(\cdot)$ represents the center of gravity. The illumination spectra are
represented by $f_1$ and $f_2$ respectively.
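The centroid-preserving property of the two ramp functions can be verified numerically. In the sketch below the axial coordinate is normalized to the interval [0, 1], the ramps are taken as the degree-1 Bernstein polynomials 1 - t and t, and the confocal peak is modeled as a Gaussian. These choices, as well as the channel positions 0 and 1 used for the centroid of the measurement, are assumptions of the example.

```python
import numpy as np

def cog(signal, positions):
    """Center of gravity of a sampled signal."""
    return np.sum(positions * signal) / np.sum(signal)

t = np.linspace(0.0, 1.0, 201)                    # normalized axial (wavelength) coordinate
x = np.exp(-(t - 0.37) ** 2 / (2 * 0.03 ** 2))    # toy confocal peak centered at 0.37

f1 = 1.0 - t                                      # falling ramp, Bernstein polynomial b_{0,1}
f2 = t                                            # rising ramp, Bernstein polynomial b_{1,1}

# Two-channel linear measurement: one camera value per illumination spectrum.
y1 = np.sum(f1 * x)
y2 = np.sum(f2 * x)

# Centroid of the measurement with channel positions 0 and 1.
cog_from_measurement = (0.0 * y1 + 1.0 * y2) / (y1 + y2)
cog_direct = cog(x, t)
```

Since `y1 + y2` equals the sum of the signal and `y2` equals its first moment, the two-frame estimate coincides with the centroid of the full signal without sampling the peak itself.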


There are several reasons for using such a linear measurement system. Firstly, 
since multiple iterations are performed, each iteration must be very efficient 
in terms of the number of frames taken. Secondly, the crosstalk at a fixed 
distance should be proportional to the measurement range. This means that 
as the location of the object becomes more certain, the crosstalk should be- 
come smaller. Lastly, the uncertainty should be inversely proportional to the 
measurement range. This means that for a smaller measurement range, the 
sensitivity should be higher. 


Figure 4.12: Iterative refinement of axial measurement.


All these properties are achieved by iteratively reducing the wavelength range
of the illumination according to the previous estimation, as illustrated
in Figure 4.12. The position of the object is represented by the arrow. In the
first iteration, the camera takes two frames with the two illumination spectra
covering the complete wavelength range. Based on the estimation result from
the first iteration, which is not extremely accurate, the object is determined
to be in the top half of the measurement range. In the second iteration, the
AdaScope makes a measurement in the new measurement range with two similar
illumination spectra. This resembles a binary search, but if the
measurement in each iteration is accurate enough, the search process can be
much faster. For example, a direct jump from iteration #1 to iteration #3 would
also be possible.


Admittedly, this method is not very sensitive and is not robust against
noise due to the limited number of linear measurement channels, but it should 
be sufficient to bound the measurement range to a certain level for the next 
iteration. 
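The iterative refinement can be mimicked with a small simulation. The following sketch is purely illustrative: the Gaussian peak, the additive noise level proportional to the total signal, the halving of the window around each estimate and all function names are assumptions, not the procedure implemented in the AdaScope.

```python
import numpy as np

def ramp_estimate(peak_pos, lo, hi, rng, peak_sigma=0.005, rel_noise=0.02, n=2001):
    """Two-frame centroid estimate of a peak inside the window [lo, hi]."""
    z = np.linspace(lo, hi, n)
    x = np.exp(-(z - peak_pos) ** 2 / (2 * peak_sigma ** 2))   # toy confocal peak
    t = (z - lo) / (hi - lo)                                   # window-normalized coordinate
    y1 = np.sum((1.0 - t) * x)                                 # frame with falling ramp
    y2 = np.sum(t * x)                                         # frame with rising ramp
    noise_std = rel_noise * (y1 + y2)                          # assumed noise per frame
    y1 += rng.normal(0.0, noise_std)
    y2 += rng.normal(0.0, noise_std)
    return lo + (hi - lo) * y2 / (y1 + y2)

def refine(peak_pos, lo=0.0, hi=1.0, n_iter=3, seed=0):
    """Halve the axial window around the current estimate in every iteration."""
    rng = np.random.default_rng(seed)
    windows = [(lo, hi)]
    for _ in range(n_iter):
        estimate = ramp_estimate(peak_pos, lo, hi, rng)
        half = (hi - lo) / 4.0                                  # half-width of the next window
        center = np.clip(estimate, lo + half, hi - half)        # keep the window inside the old one
        lo, hi = center - half, center + half
        windows.append((lo, hi))
    return windows

windows = refine(0.37)
```

Because the assumed noise scales with the total signal, the absolute error of the estimate shrinks together with the window, which is the behavior the refinement relies on.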


4.2.3 Lateral Array Condensation 


As mentioned previously, in each iteration, the measurement density is also 
increased accordingly. As shown by the example in Figure 4.13, in iteration #1
with a pitch of 20 pixels, the point array has to be scanned 400 times, and
each time, the system makes two measurements using the corresponding
illumination spectra. In the next iteration, the density of the array can be 
increased depending on how much the new measurement range is bounded. 


Figure 4.13: Iterative condensation of the lateral array. The DMD pitch distance and the numbers
of frames per iteration are labeled respectively.


At a certain iteration, based on the estimation uncertainty from the previous
iteration, the measurement process can be switched to a localized chromatic 
confocal measurement centered around the previous estimation result, in or- 
der to get more accurate measurement results. 
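The frame count of one iteration follows directly from the pitch: a point array with a pitch of p pixels has to be shifted over p × p offsets to visit every pixel, and each offset requires two frames. The short sketch below reproduces the count for the pitch of 20 pixels mentioned above. The pitch schedule (20, 10, 5) used for the total is a hypothetical example and is not taken from Figure 4.13.

```python
def frames_for_iteration(pitch, frames_per_position=2):
    """Number of camera frames needed to cover all pixels with a square point array."""
    scan_positions = pitch * pitch          # lateral offsets of the array
    return scan_positions * frames_per_position

first_iteration = frames_for_iteration(20)                  # pitch of 20 pixels
hypothetical_schedule = [20, 10, 5]                         # illustrative pitches only
total_frames = sum(frames_for_iteration(p) for p in hypothetical_schedule)
```

Since the count grows with the square of the pitch, the coarse first iteration dominates the total number of frames in such a schedule.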


4.2.4 Triggering Mechanism 


As an example, the triggering diagram for the second iteration as well as the 
corresponding illumination spectra are illustrated in Figure 4.14. The camera
serves as the master which triggers the spectral DMD in the programmable 
light source. This DMD displays several patterns corresponding to several 
illumination spectra. Each spectral DMD pattern triggers its corresponding
spatial DMD pattern in the microscope. Based on the estimation from the first 
iteration, all points are already bounded to either the top half or the bottom 
half of the complete measurement range. For each measurement grid, two 
frames are captured. Within each frame, two spectra are projected to two dif- 
ferent spatial patterns. In the second frame, the spatial patterns are repeated 
but the spectra are different. This process is then repeated $n_s$ times for complete
measurement of this iteration, where $n_s$ denotes the pitch distance of
the current iteration. 


4.2.5 Summary 


In this section, an iterative array adaptation method for 3D confocal scanning 
is proposed. Instead of keeping a constant pitch distance for the array scan- 
ning, multiple iterations of array scanning can be performed while the pitch 
distance and the axial measurement range are modified dynamically. For the 
axial scanning, a linear measurement method based on the Bernstein polyno- 
mials has been developed, where the axial range is halved from iteration to 
iteration. For the lateral direction, the array density is also increased itera- 
tively. This results in a much more efficient 3D scanning procedure compared 
to the conventional array scanning method. 


Figure 4.14: Exemplary triggering diagram of iteration #2 and the corresponding illumination
spectra. 


4.3 Direct Area Confocal Scanning 


Although the iterative array adaptation method improves the scanning speed 
by dynamically changing the axial measurement range and the lateral array 
pitch, in each iteration, a certain pitch distance between the adjacent mea- 
surement locations still has to be guaranteed. Meanwhile, due to the wider 
bandwidth of the illumination spectra, the crosstalk could still affect the mea- 
surement result despite the reduced axial range in each iteration. 


In this section, an alternative method is presented based on an entirely differ- 
ent principle, i.e., direct area confocal scanning. This method is both more ef- 
ficient and more accurate compared to the iterative array adaptation method, 
and thus proves to be a better choice for the main measurement stage in the 
cascade strategy. 


4.3.1 Theoretical Analysis


Direct area confocal measurement is generally considered impossible due to 
the fundamental limitations of illumination and imaging. This has been partly 
circumvented by spectrally encoded slit confocal microscopy, where one lateral
axis is tackled with a physical slit and the orthogonal lateral axis is covered
by lateral dispersion of the slit [Kim15]. This section presents an
alternative approach for direct area confocal measurement based on a completely
different principle, which, as will be demonstrated in later sections,
provides additional benefits for the complete scanning process in the
proposed system. 


4.3.1.1 Optical Model 


All confocal systems rely on the same principle that unfocused illumination 
light gets spread to the adjacent area. The reflected light is further filtered by 
the confocal pinhole, whether it’s a physical pinhole, a fiber end or a single 
pixel, which finally generates the confocal peak in the detected signal. 


Figure 4.15 illustrates three different types of microscopes in reflective
configuration, where the illumination arm and the detection arm share the same
optical system due to the beam splitter. The imaging processes of these microscopes
will be investigated in detail, with x and y representing the lateral
directions and z representing the axial direction. 


Figure 4.15(a) shows a scanning microscope, where a wide-field illumination
is projected onto the object and a vanishingly small pinhole is applied before
the detector. Commonly referred to as a type-1a microscope, this setup has
been shown to be equivalent to a conventional wide-field microscope [Wil84].


Figure 4.15: Three types of microscopes: (a) Type-1a scanning microscope which is equivalent
to the conventional wide-field microscope. (b) Confocal microscope. (c) Type-1a
scanning microscope with a tilted illumination field.


The intensity response of such a system to a point object can be expressed as


$$I(u, v) = |h(u, v)|^2, \qquad (4.5)$$


where $h(u,v)$ stands for the amplitude point spread function of the optical
system with respect to the optical coordinates in the object space:


$$v = k r \sin\alpha = k \sqrt{x^2 + y^2}\, \sin\alpha, \qquad (4.6)$$

$$u = 4 k\, \delta z\, \sin^2\frac{\alpha}{2}. \qquad (4.7)$$


In these equations, $k$ represents the wave number, $\sin\alpha$ is the numerical aperture
and $\delta z$ represents a small axial deviation from the focal plane. For 3D surface
profilometry, the more important factor is the integrated axial response
of the system, which is denoted by $I_{\mathrm{int}}$. This factor can be treated as the overall
intensity in the image of a point object, or approximately considered as the
intensity response to a planar diffusing object, which can be written as 


$$I_{\mathrm{int}}(u) = 2\pi \int_0^{\infty} I(u,v)\, v \,\mathrm{d}v. \qquad (4.8)$$


Although it has been proven that a slit can be used in a confocal system instead 
of a pinhole with only a slightly widened confocal peak [She88], it is apparent
that a direct area confocal measurement is not possible. Theoretically, it has 
been proven with Parseval’s theorem that the integrated intensity response 
in Equation 4.8 does not fall off with respect to u. This can equally be argued
with the conservation of energy. When a wide-field illumination is applied 
to a planar object, all light is reflected and therefore the intensity response 
remains constant as the object is scanned axially. Consequently, such a system 
does not possess the capability of depth discerning. 


In contrast, for the confocal system illustrated in Figure 4.15(b), the intensity
response of a point object is given by


$$I(u, v) = |h(u, v)|^4. \qquad (4.9)$$

The integrated intensity response in the focal region can be written as


$$I_{\mathrm{int}}(u) = 2\pi \int_0^{\infty} \left(C^2(u,v) + S^2(u,v)\right)^2 v \,\mathrm{d}v, \qquad (4.10)$$


in which $C(u,v)$ and $S(u,v)$ are defined as [Wil84]


$$C(u,v) = \int_0^1 2\cos\!\left(\tfrac{1}{2} u \rho^2\right) J_0(v\rho)\, \rho \,\mathrm{d}\rho, \qquad (4.11)$$

$$S(u,v) = \int_0^1 2\sin\!\left(\tfrac{1}{2} u \rho^2\right) J_0(v\rho)\, \rho \,\mathrm{d}\rho, \qquad (4.12)$$


with $J_n$ as the Bessel function of the first kind of order $n$. The integrated intensity
response can be evaluated numerically, which demonstrates that the intensity
drops as the object moves away from the focal plane. This phenomenon
serves as the basis for the depth discerning capability of confocal systems.
Figure 4.16 illustrates the axial response as well as the integrated intensity for
both conventional wide-field microscope and confocal microscope. 
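A rough numerical evaluation of the two integrated responses can be sketched with NumPy alone. The example assumes the standard pupil-integral form of $C(u,v)$ and $S(u,v)$ with $J_0$, evaluates $J_0$ through its integral representation, and truncates the radial integral at a finite $v$, so the wide-field value only approximates its theoretical constant. Grid sizes and the truncation limit are arbitrary choices for illustration.

```python
import numpy as np

def bessel_j0(x, n_theta=64):
    # J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) dtheta, midpoint rule.
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    return np.mean(np.cos(x[..., None] * np.sin(theta)), axis=-1)

N_RHO, N_V, V_MAX = 200, 300, 30.0                 # grid sizes and truncation (assumptions)
rho = (np.arange(N_RHO) + 0.5) / N_RHO             # pupil radius, midpoints on [0, 1]
v = (np.arange(N_V) + 0.5) * V_MAX / N_V           # lateral optical coordinate
J0_TABLE = bessel_j0(np.outer(v, rho))             # shape (N_V, N_RHO)

def pupil_integrals(u):
    """C(u, v) and S(u, v) for all sampled v."""
    weights = 2.0 * rho / N_RHO
    c = J0_TABLE @ (np.cos(0.5 * u * rho ** 2) * weights)
    s = J0_TABLE @ (np.sin(0.5 * u * rho ** 2) * weights)
    return c, s

def integrated_intensity(u, confocal):
    c, s = pupil_integrals(u)
    point_response = c ** 2 + s ** 2               # |h|^2, wide-field case
    if confocal:
        point_response = point_response ** 2       # |h|^4, confocal case
    return 2.0 * np.pi * np.sum(point_response * v) * V_MAX / N_V

u_values = [0.0, 6.0, 12.0]
confocal = [integrated_intensity(u, True) for u in u_values]
wide_field = [integrated_intensity(u, False) for u in u_values]
```

With this normalization the wide-field value stays close to $4\pi$ for all defocus values, apart from the truncation error, while the confocal value decreases with increasing $u$.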


Figure 4.16: Axial response $I$ and integrated intensity $I_{\mathrm{int}}$ for wide-field microscope and confocal
microscope.


Another system is presented in Figure 4.15(c), which is similar to the type-1a
scanning microscope illustrated in Figure 4.15(a). The only difference is that
the focal plane of the optical system is rotated with respect to the y-axis. One
specific implementation of the system is discussed in the following sections
and currently the optical system is treated as a black box which is capable of
generating such a tilted focal field. Since this configuration breaks the radial
symmetry of the system, Cartesian coordinates are used instead of the optical
coordinates. For incoherent imaging, the intensity response of the system to 
a point object at (x,y,z) can be expressed as 


$$I(x, y, z) = \iiint H(x - x', y - y', z - z')\, H(x, y, z - z')\, F(x', y', z') \,\mathrm{d}x' \,\mathrm{d}y' \,\mathrm{d}z', \qquad (4.13)$$


where H(x,y,z) is the intensity point spread function of the optical system and 
F(x,y,z) is a 3D mask function which defines the illumination distribution. 


The integral shown in Equation 4.13 has a form similar to a convolution,
where the first H is shifted three-dimensionally and multiplied by the mask 
function to account for the tilted area illumination field. The second H repre- 
sents the imaging intensity point spread function by the point object. In the 
particular implementation with the AdaScope system, chromatic encoding is 
applied to achieve the tilted illumination field, where the focal length of the 
optical system varies according to the wavelength. To account for this effect, 
the axial coordinate z in the second H is also shifted so that the focal length 
matches that of the illumination wavelength. 


With the on-axis focal point position defined as (0,0,0) and the angle between
the illumination plane and the x-axis denoted as θ, the mask function can be
expressed as F(x,y,z,θ) = δ(z − x tan θ). Equation 4.13 can therefore be
simplified as

\[ I(x,y,z,\theta) = \iint H(x-x',\,y-y',\,z-x'\tan\theta)\,H(x,\,y,\,z-x'\tan\theta)\,\mathrm{d}x'\,\mathrm{d}y'. \tag{4.14} \]

The corresponding integrated intensity response can be expressed as

\[ I_{\mathrm{int}}(z,\theta) = \iint I(x,y,z,\theta)\,\mathrm{d}x\,\mathrm{d}y. \tag{4.15} \]

This expression does not have an analytic solution and thus must be evaluated
numerically.
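
As a rough illustration of such a numerical evaluation, the following sketch
computes the integrated response under a strongly simplifying assumption that
is not part of the model above: the intensity PSF H is replaced by a paraxial
Gaussian beam. For this PSF the integrals over y, y' and x can be carried out
in closed form, so that only a single integral over x' remains. The function
name, the Gaussian-beam PSF and the finite integration window are hypothetical
choices made for illustration, and the result is not expected to reproduce the
simulated curves of this chapter quantitatively.

```python
import numpy as np

def integrated_response(z, theta_deg, na=0.33, wavelength=0.58,
                        half_width=40.0, n=8001):
    """Approximate I_int(z, theta) for a Gaussian-beam PSF (illustrative only).

    Assumption (not from the text): H(x, y, z) = (w0/w)^2 exp(-2 r^2 / w^2)
    with w(z) = w0 sqrt(1 + (z / z_R)^2). Integrating over y, y' and x
    analytically leaves, up to a constant factor,
        I_int ~ int exp(-x'^2 / w(zeta)^2) / w(zeta) dx',
    with zeta = z - x' tan(theta). All lengths are in micrometres.
    """
    w0 = wavelength / (np.pi * na)        # paraxial beam waist
    z_r = np.pi * w0 ** 2 / wavelength    # Rayleigh range
    xp = np.linspace(-half_width, half_width, n)
    zeta = z - xp * np.tan(np.radians(theta_deg))
    w = w0 * np.sqrt(1.0 + (zeta / z_r) ** 2)
    integrand = np.exp(-(xp / w) ** 2) / w
    # simple Riemann sum over the finite window in x'
    return float(np.sum(integrand) * (xp[1] - xp[0]))

z_axis = np.linspace(-50.0, 50.0, 11)
flat = np.array([integrated_response(z, 0.0) for z in z_axis])
tilted = np.array([integrated_response(z, 75.0) for z in z_axis])
print("0 deg :", np.round(flat / flat.max(), 3))
print("75 deg:", np.round(tilted / tilted.max(), 3))
```

Within this simplified model the response at 0° is essentially independent of
z, while the tilted field yields a response that is highest near z = 0. The
width of that peak depends on the PSF model and on the integration window.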


4.3.1.2 Simulation result 


In this section, Equation 4.14 and Equation 4.15 are evaluated through
numerical simulation to investigate the depth-discerning capability of the
tilted-field system for 3D measurement. The intensity point spread function
H(x,y,z) is simulated based on a fast 3D PSF model for a volume of
80 µm × 80 µm × 150 µm. The numerical integration is implemented afterwards.
A numerical aperture of 0.33 is utilized, which corresponds to the
experimental setup, and a wavelength of 580 nm is specified.


To understand the simulation results, a simpler case with a tilted focal field
under slit illumination is investigated first. In this case, a slit along the
x-axis is applied to the light source, generating a tilted line of focused
illumination which forms an angle θ with respect to the x-axis. The integrated
intensity response of such a configuration is simulated and illustrated in
Figure 4.17. For each angle θ, the intensity response is normalized so that
the maximum intensity equals one. The bright green curve at the bottom
represents the integrated intensity response of a conventional single-point
confocal microscope. As can be seen from the simulation result, for most of
the angles, the system remains capable of depth discerning, although the
intensity peak is slightly broadened. It is worth noting that at 0°, the
system is exactly a conventional slit-scanning confocal system, whereas at
90°, the system becomes a physically unrealizable confocal system with a focal
point infinitely elongated along the optical axis. Alternatively, it can be
seen as a single-point chromatic confocal system with a detector that is not
capable of wavelength discerning. Therefore, as the angle approaches 90°, the
intensity response becomes flatter and the system gradually loses its
depth-discerning capability.

Figure 4.17: Simulation result for tilted slit illumination. Top: axial
integrated intensity response as an intensity map. Bottom: axial integrated
intensity responses for a series of tilting angles in comparison to the
confocal signal.


Figure 4.18: Simulation result for tilted plane illumination. Top: axial
integrated intensity response as an intensity map. Bottom: axial integrated
intensity responses for a series of tilting angles in comparison to the
confocal signal.

More interesting is the case where a complete planar light source is used to
generate a tilted planar illumination field, as presented in the previous
section. As shown by the results illustrated in Figure 4.18, at 0°, the system
represents a type 1a scanning microscope, which is equivalent to a wide-field
microscope. At 90°, the system can be treated as an array of chromatic
confocal point sensors with broadband detectors or simply as a chromatic
confocal slit-scanning system with a broadband detector. It is apparent that
both the case of 0° and the case of 90° represent a system which is not
capable of 3D measurement. However, as the angle varies between 0° and 90°, an
interesting region arises around roughly 75°, where an intensity peak is
clearly visible, indicating the capability of depth discerning. The FWHM of
the conventional integrated confocal intensity peak is roughly 7 µm and the
FWHM of the peak at 75° is roughly 32 µm. Although the intensity peak is
several times wider than a conventional confocal peak, this sacrifice leads to
an imaging system capable of true area confocal scanning.


Figure 4.19 illustrates the intensity response of the proposed system when a
point object is scanned laterally. This is simulated by calculating I(x,y,z,θ)
in Equation 4.14 through numerical integration. In the y-direction, since the
plane of illumination is tilted with respect to the y-axis, the response is
very similar to that of a slit-scanning confocal microscope in the direction
parallel to the slit. In the x-direction, the width of the intensity response
is less affected by the area illumination but is more sensitive to the change
of the tilting angle. In both cases, the FWHM of the signal is only slightly
wider than that of a conventional confocal signal, indicating a good lateral
resolution very similar to that of a conventional confocal system.


Based on the simulation results illustrated in Figure 4.18, the axial FWHM of
the integrated intensity response can be calculated. Figure 4.20 demonstrates
the change of the axial FWHM with respect to the tilting angle for three
different NAs. Similar to a conventional confocal scanning system, the
effective axial FWHM is reduced as the NA is increased. Additionally, a larger
NA allows for a larger operational tilting angle range, which moves toward
lower tilting angles. Consequently, the optimum tilting angle for the minimum
FWHM is also reduced as the NA increases. For an NA of 0.33, which will be
utilized in the experimental setup, the optimum tilting angle is approximately
75°.
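
The extraction of an FWHM value from a sampled response curve can be sketched
as follows. This is a minimal illustration rather than the evaluation code
used for Figure 4.20: the function name is hypothetical, the curve is assumed
to have a single peak on a zero baseline, and the half-maximum crossings are
located by linear interpolation between neighbouring samples.

```python
import numpy as np

def fwhm(z, intensity):
    """FWHM of a single-peaked sampled curve (zero baseline assumed)."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(intensity, dtype=float)
    half = y.max() / 2.0
    above = np.where(y >= half)[0]
    left, right = above[0], above[-1]
    if left == 0 or right == len(y) - 1:
        raise ValueError("peak is not fully contained in the sampled range")
    # linear interpolation of the half-maximum crossing on each flank
    z_left = np.interp(half, [y[left - 1], y[left]], [z[left - 1], z[left]])
    z_right = np.interp(half, [y[right + 1], y[right]], [z[right + 1], z[right]])
    return z_right - z_left

# Check against a Gaussian, whose FWHM is 2 sqrt(2 ln 2) sigma
z = np.linspace(-50.0, 50.0, 2001)
sigma = 3.0
gauss = np.exp(-z ** 2 / (2.0 * sigma ** 2))
width = fwhm(z, gauss)
print(width, 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma)
```

For curves with a significant background level, the baseline would have to be
subtracted first, since the half maximum is taken relative to zero here.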

Figure 4.19: Intensity response of a point object with tilted plane
illumination. In both x- and y-directions, the intensity response is only
slightly wider than a conventional confocal signal.

Figure 4.20: Axial FWHM of the integrated intensity with respect to NA and
tilting angle θ. For each NA, an optimum angle can be located where the axial
FWHM is minimum.


4.3.2 Scanning Mechanism 


To implement the method presented in Section 4.3.1 based on the proposed
setup, several important aspects are studied and described in detail in this
section.


4.3.2.1 Illumination Generation 


As demonstrated by the simulation result in Section 4.3.1.2, the
depth-discerning capability of the tilted area confocal scanning method is
only maintained in a small range of angles around 75°. Although there are
possible solutions based on special optical design (e.g. with a Scheimpflug
configuration), the imaging quality could be adversely affected. Therefore, in
the AdaScope setup, the tilted focal field is implemented through multiplexed
chromatic encoding. Since the illumination spectrum can be tuned through the
first DMD in the programmable light source and the lateral locations can also
be arbitrarily addressed through the second DMD, the combination of the two
DMDs allows the generation of a focused illumination field anywhere within the
three-dimensional measurement volume. Through time multiplexing, within the
exposure time of each camera frame, several pairs of patterns are displayed by
the two DMDs, forming a series of localized illumination distributions in the
target space. By controlling the two DMDs simultaneously, it is possible to
generate any 3D focal field, including the required tilted planar illumination
field. In this setup, the programmable light source generates a series of
spectral Gaussian peaks of equal intensity with an FWHM of 1 nm. Each spectral
peak corresponds to one column of lateral DMD pixels. The tilting angle of the
illumination plane can be calculated through the following expression:


\[ \theta = \arctan\left( \frac{\Delta\lambda \times a_{z,\lambda}}{d_p \times M_I} \right) \tag{4.16} \]

where Δλ represents the wavelength step between adjacent spectral peaks and
M_I denotes the magnification of the illumination arm, which equals 0.37. The
chromatic focal shift a_{z,λ} can be expressed as a_{z,λ} = δz/δλ, and d_p
denotes the physical pitch of a single lateral DMD pixel, which is 7.56 µm in
the proposed setup. For example, at a wavelength of 530 nm, the chromatic
focal shift is approximately 29.13 µm nm⁻¹. To generate a tilting angle of
75°, a wavelength step of 0.36 nm is required between adjacent columns of
pixels. Due to the nonlinear chromatic focal shift (Figure B.24), the
wavelength step between adjacent DMD pixel columns must vary according to the
corresponding wavelength in order to maintain a linear axial spacing.

Such an implementation of the tilted planar illumination field differs from
the idealized theoretical model of Section 4.3.1 mainly in two aspects.
Firstly, while the illumination plane in the model is continuous and extends
to infinity in all dimensions, such illumination is impossible in practice and
is approximated by a finite illumination plane composed of discrete
illuminated locations. Secondly, the numerical aperture varies slightly
between wavelengths instead of remaining constant, as is assumed in the
theoretical model. Despite these discrepancies, the theoretical model is
considered a valid approximation to the practical setup.
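
The numbers quoted for Equation 4.16 can be checked with a few lines of code.
The sketch below uses hypothetical function names and assumes that the lateral
spacing of adjacent columns in object space is the DMD pixel pitch multiplied
by the magnification of the illumination arm, as implied by the equation.

```python
import math

def tilt_angle_deg(delta_lambda_nm, focal_shift_um_per_nm,
                   pixel_pitch_um, magnification):
    """Tilting angle of the illumination plane according to Eq. 4.16."""
    axial_step = delta_lambda_nm * focal_shift_um_per_nm   # µm per column
    lateral_step = pixel_pitch_um * magnification          # µm per column
    return math.degrees(math.atan(axial_step / lateral_step))

def wavelength_step_nm(theta_deg, focal_shift_um_per_nm,
                       pixel_pitch_um, magnification):
    """Eq. 4.16 solved for the wavelength step between adjacent columns."""
    lateral_step = pixel_pitch_um * magnification
    return math.tan(math.radians(theta_deg)) * lateral_step / focal_shift_um_per_nm

# Values stated in the text for a wavelength of 530 nm
theta = tilt_angle_deg(0.36, 29.13, 7.56, 0.37)
step = wavelength_step_nm(75.0, 29.13, 7.56, 0.37)
print(f"tilting angle for 0.36 nm step: {theta:.2f} deg")
print(f"wavelength step for 75 deg:     {step:.3f} nm")
```

Because the chromatic focal shift is wavelength dependent, the second function
would have to be evaluated with the local focal shift of each column to obtain
the varying wavelength steps mentioned above.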

Figure 4.21: Periodic planar illumination with the adaptive microscope. The
tilting of the illumination field is achieved through the lateral change of
the illumination wavelength and the axial chromatic focal shift of the optical
system.


Due to the large tilting angle, even for the full axial range of 4.67 mm, the
effective lateral coverage of illumination in the x-direction is only 1.25 mm.
This seems to be a major drawback of the tilted area scanning method.
Nevertheless, thanks to the adaptability of the proposed system, multiple
periods of illumination planes can be easily configured (Figure 4.21).
Although the boundary area of two adjacent illumination periods may be
susceptible to additional crosstalk, the experimental results will show that
the adverse influence is well within an acceptable level.
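
The quoted lateral coverage follows directly from the geometry of a plane
tilted by θ = 75° with respect to the x-axis. With Δz and Δx introduced here
only for illustration as the axial range and the lateral extent of one
illumination period:

\[ \Delta x = \frac{\Delta z}{\tan\theta} = \frac{4.67\,\mathrm{mm}}{\tan 75^\circ} \approx \frac{4.67\,\mathrm{mm}}{3.73} \approx 1.25\,\mathrm{mm}. \]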


4.3.2.2 Scanning Direction 


The confocal scanning is achieved through manipulation of the 3D illumina- 
tion field based on the control of the two DMDs, while the camera records 
one image for each illumination field. As the complete illumination is shifted
axially, the part of the illumination field which leaves the measurement
volume is wrapped in from the opposite side. For example, Figure 4.22
demonstrates the course of a scanning process. Since the tilting angle of the
illumination field is much larger than 45°, it is more intuitive to consider
that the illumination field is being scanned laterally along the x-axis. In
fact, due to the two-dimensional relative movement and the wrapping of the
illumination field, scanning in the x-direction is exactly equivalent to
scanning in the z-direction with a different scanning speed. One additional
benefit brought
by such equivalency is that the direct area confocal scanning method is poten- 
tially applicable on an assembly line, where one period of tilted illumination 
pattern can be fixed while the object is scanned along one lateral direction by 
the transporting system. 

Figure 4.22: The illumination field is scanned by shifting the illumination wavelengths laterally.


The total number of images can be adaptively tuned according to the axial
measurement range and the discretization of the illumination field, which
depends on the required accuracy of the measurement. For example, if 50
illumination spectra are utilized, which corresponds to 50 columns of pixels
on the spatial DMD, a total of 50 images are required for a complete
three-dimensional scan of the surface. By enlarging the FWHM of the spectral
peak, the axial intensity response is effectively widened due to the
overlapping of multiple shifted illumination planes. This reduces the axial
resolution of the system but allows faster scanning, e.g., shifting of the 3D
illumination field by multiple pixel columns instead of one column. This
property further enhances the adaptability of the measurement system.
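
The relation between the number of spectra, the shift per frame and the number
of images can be sketched as follows. The function name and the direction of
the cyclic shift are assumptions for illustration; the sketch only models the
assignment of spectra to pixel columns within one illumination period.

```python
import math
import numpy as np

def scan_schedule(n_spectra, step=1):
    """Spectrum index assigned to each pixel column, for every camera frame.

    Column k initially shows spectrum k. For each new frame the assignment
    is shifted cyclically by `step` columns, so that spectra leaving one
    side of the period are wrapped in from the other side.
    """
    n_frames = math.ceil(n_spectra / step)
    base = np.arange(n_spectra)
    return [np.roll(base, -frame * step) for frame in range(n_frames)]

full_scan = scan_schedule(50, step=1)   # one column per frame
fast_scan = scan_schedule(50, step=2)   # two columns per frame
print(len(full_scan), len(fast_scan))
print(full_scan[1][:3], full_scan[1][-1])
```

With a shift of one column per frame, every column is illuminated by every
spectrum exactly once over the scan. Larger shifts reduce the number of frames
at the cost of a coarser axial sampling.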


4.3.2.3 Synchronization Mechanism 


As discussed in Section the sCMOS camera (Andor Zyla 5.5) is the slow- 
est component in the system and is thus applied as the master in the synchro- 
nization in order to fully utilize its potential. The synchronization mechanism 
is adapted from the mode #1 illustrated from Figure where the camera 
triggers a series of spectral patterns, each of which then triggers its corre- 
sponding spatial pattern. 

Figure 4.23: Triggering diagram. The camera triggers the spectral DMD to start
a series of illumination spectra. Each spectrum triggers a corresponding
pattern on the spatial DMD.


To avoid the transmission of new patterns to the DMDs between camera frames, a
special mechanism is introduced in which an additional black pattern is
appended to the pattern sequence of the spectral DMD. As demonstrated by the
triggering diagram in Figure 4.23, at the beginning of each exposure, the
camera sends a trigger signal to the spectral DMD, which starts a continuous
series of n + 1 patterns. Each of the first n spectral DMD patterns
corresponds to the illumination of a spectral peak and triggers its respective
spatial pattern. The last spectral DMD pattern is completely black and serves
to send one additional trigger to the spatial DMD. Since the spatial DMD is
only loaded with n patterns, this additional trigger from the spectral DMD
effectively shifts and wraps one spatial pattern until the end of the next
exposure, i.e., the order of the spatial pixel columns is shifted by one
pixel. The combined effects result in a shifted 3D periodic illumination field
from frame to frame, as illustrated in Figure 4.22. In this way, all patterns
can be transferred to the DMDs offline, and during the measurement, the camera
can be operated in a continuous burst mode at its maximum speed.
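
The effect of the additional black pattern can be illustrated with a small
simulation of the trigger chain. The function name and the zero-based indexing
are hypothetical; the sketch assumes that the spatial DMD advances cyclically
by one pattern on every trigger it receives.

```python
def simulate_triggers(n, n_frames):
    """Pair spectral and spatial DMD patterns frame by frame.

    The spectral DMD holds n illumination patterns plus one black pattern.
    The spatial DMD holds n patterns and advances cyclically on each trigger.
    Returns, per frame, the list of (spectral index, spatial index) pairs
    that are displayed while light is on.
    """
    spatial_index = 0
    frames = []
    for _ in range(n_frames):
        pairs = []
        for spectral_index in range(n + 1):
            if spectral_index < n:
                # illuminated spectral pattern: record the active pairing
                pairs.append((spectral_index, spatial_index))
            # every spectral pattern, including the black one, sends a trigger
            spatial_index = (spatial_index + 1) % n
        frames.append(pairs)
    return frames

frames = simulate_triggers(n=4, n_frames=5)
for i, pairs in enumerate(frames):
    print(f"frame {i}: {pairs}")
```

Because n + 1 triggers are sent per frame to a sequence of n spatial patterns,
the pairing shifts by one position in every frame and returns to its initial
state after n frames, without any pattern upload between frames.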


4.3.3 Summary 


Despite the fact that a wide-field microscope lacks the capability of depth 
discerning, it has been demonstrated in this section that direct area confo- 
cal scanning is indeed possible as long as the focal field is tilted to an angle 
specific to the NA of the system. The simulation results show that the axial
confocal response can be largely preserved, yielding an axial FWHM of
32 µm for an NA of 0.33 at the optimum tilting angle. The lateral response
is slightly wider than that of the confocal case, demonstrating a good lateral 
resolution. Compared to the conventional array scanning, the measurement 
speed is greatly improved as all lateral locations are scanned simultaneously. 


4.4 RNN-accelerated Experimental Design 


Direct area scanning based on the tilted focal field leads to a significant
improvement of the scanning speed compared to conventional array scanning.
Nevertheless, measurement uncertainty becomes slightly worse due to the
widened signal peak, as shown in Section 4.3.1.2. To reach or even surpass the
accuracy of conventional confocal array scanning, a final stage of refined
measurement is required, which is implemented through a localized axial
confocal scan based on a fixed lateral array with a significantly smaller
pitch distance. Based on the result of the previous measurement stage, the
scan can be initialized and performed directly in the vicinity of the peak
position.


Although the localized axial scan can be achieved in a uniform sampling man- 
ner, this section introduces a more efficient scanning method based on Bayes- 
ian experimental design, which can be further accelerated through an approx- 
imation based on a recurrent neural network. 


4.4.1 Chromatic Confocal Signal 


The target of chromatic confocal measurement is to retrieve the depth of the 
measurement position via the location of the Gaussian-like signal peak. From 
the point of view of parameter estimation, the canonical approach is to build a 
measurement model and apply Bayesian inference on the parameters of inter- 
est. The measurement model is composed of two parts, i.e., the signal model 
and the noise model. The signal model describes the relationship between an 
ideal signal, or expectation of the signal, and the parameters to be estimated. 
And the noise model represents the amount of noise added to the ideal model. 


In the case of the confocal measurement, the axial intensity response can be 
derived from Equation Ed, which has an analytical form: 


\[ I_c = \left( \frac{\sin(u/4)}{u/4} \right)^4. \tag{4.17} \]


As the object is positioned axially in a chromatic confocal system, the
detected signal can be approximated by a Gaussian function, which can be
expressed by the following equation:

\[ \bar{g}(\lambda;\theta) = \theta_2 \exp\left( -\frac{(\lambda-\theta_1)^2}{2\sigma^2} \right), \tag{4.18} \]

where θ_2 represents the amplitude of the signal and θ_1 represents the
location of the signal. The parameter θ_2 is mainly determined by the
reflectance of the object and θ_1 reflects the axial position of the object.
The width of the Gaussian-shaped chromatic confocal signal is related to σ and
is determined by the properties of the optical system such as the numerical
aperture. Assuming normally distributed noise, the complete model is expressed
as a normal distribution over the combination of the signal and the noise:

\[ g \sim \mathcal{N}\left(\bar{g}, \sigma_n^2\right), \tag{4.19} \]

where σ_n² describes the variance of the noise and is mainly determined by the
camera.


Based on Bayes' theorem, the parameter estimation task is relatively
straightforward: the posterior probability distribution of the parameters is
calculated based on the measurement model. In this case, the parameters of
interest are θ = (θ_1, θ_2), where θ_1 contains the depth information and θ_2
contains information on the object texture. The posterior is proportional to
the product of the prior and the likelihood. Without any prior knowledge, the
prior distribution is considered to be flat across the valid support so that
all parameter values are equally probable when no measurements have been made.
The likelihood comes directly from the measurement model, as shown in
Equation 4.19. Therefore, the posterior distribution can be calculated up to a
scale factor:

\begin{align}
p(\theta|g) &= \frac{p(\theta)\,p(g|\theta)}{p(g)} \tag{4.20} \\
&\propto p(\theta)\,p(g|\theta). \tag{4.21}
\end{align}


In practice, the calculation of the posterior distribution with high
resolution is often computationally prohibitive, and therefore sampling
techniques such as the Markov chain Monte Carlo (MCMC) method are frequently
adopted. For parameter estimation of the chromatic confocal signal, an
affine-invariant ensemble sampler [Goo10] is utilized for drawing the
posterior samples. Once samples are drawn from the posterior distribution, the
estimate is obtained simply by calculating the average of all samples.
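
For a model with only two parameters, the posterior can also be illustrated by
brute-force evaluation on a grid, which is used in the sketch below in place
of the ensemble sampler. The sketch assumes a Gaussian signal with location
θ_1 and amplitude θ_2, a flat prior on the unit square, and hypothetical
values for the signal width and the noise level. All names are chosen for
illustration only.

```python
import numpy as np

SIGMA = 0.1      # assumed signal width in normalized wavelength units
SIGMA_N = 0.05   # assumed standard deviation of the measurement noise

def signal(lam, theta1, theta2):
    """Gaussian signal model: theta1 = peak location, theta2 = amplitude."""
    return theta2 * np.exp(-(lam - theta1) ** 2 / (2.0 * SIGMA ** 2))

def grid_posterior(lam, g, n_grid=201):
    """Posterior over (theta1, theta2) on a grid with a flat prior on [0, 1]^2."""
    t1 = np.linspace(0.0, 1.0, n_grid)
    t2 = np.linspace(0.0, 1.0, n_grid)
    T1, T2 = np.meshgrid(t1, t2, indexing="ij")
    # Gaussian log-likelihood, summed over all measurements
    resid = g[None, None, :] - signal(lam[None, None, :], T1[..., None], T2[..., None])
    log_post = -0.5 * np.sum(resid ** 2, axis=-1) / SIGMA_N ** 2
    post = np.exp(log_post - log_post.max())   # subtract maximum for stability
    return t1, t2, post / post.sum()

rng = np.random.default_rng(0)
lam = np.linspace(0.0, 1.0, 11)                # measured wavelengths
true_theta = (0.43, 0.8)
g = signal(lam, *true_theta) + SIGMA_N * rng.standard_normal(lam.size)

t1, t2, post = grid_posterior(lam, g)
p1, p2 = post.sum(axis=1), post.sum(axis=0)    # marginal distributions
theta1_mean = float(np.sum(p1 * t1))
theta2_mean = float(np.sum(p2 * t2))
theta1_std = float(np.sqrt(np.sum(p1 * (t1 - theta1_mean) ** 2)))
print(theta1_mean, theta1_std, theta2_mean)
```

The posterior mean serves as the estimate and the posterior standard deviation
as its uncertainty. A grid evaluation scales poorly with the number of
parameters, which is one reason why sampling methods are preferred in general.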

Figure 4.24: Posterior sampling after measurements are made. Both the signal
amplitudes (g and θ_2) and the wavelengths (λ and θ_1) are shown in a
normalized range from 0 to 1.


Figure 4.24 demonstrates the procedure of posterior sampling for the chromatic
confocal measurement through simulation. In the left figure, the expectation
of the signal is denoted by ḡ and the simulated measurements with normally
distributed noise are contained in G. The top right figure illustrates the
posterior probability distribution of the parameters to be estimated and the
bottom right figure shows samples drawn from this distribution.

The Bayesian framework has two major advantages for parameter estimation.
Firstly, the uncertainty of the estimation can be easily derived by calculating 
the variance of the samples. Secondly, the posterior distribution of the param- 
eter allows for the selection of the optimal measurement location in the next 
measurement through Bayesian experimental design, as will be discussed in 
the following. 


4.4.2 Bayesian Experimental Design 


Bayesian Experimental Design (BED) is concerned with making decisions under
uncertainty with limited resources. In the case of measuring a chromatic
confocal signal, conventional systems utilize a spectrometer which disperses
the various wavelengths onto hundreds of pixels. A major drawback of such an
approach is that the transfer of the intensity data can be quite slow.
Additionally, in the case of an array chromatic confocal system, the
application of multiple spectrometers is often prohibitive, due to either cost
or mechanical constraints. Therefore, wavelength scanning of the light source
is used instead to acquire the chromatic confocal signal. Nevertheless, such a
process can be time-intensive depending on the scanning speed of the light
source.


Instead of an equidistant measuring scheme, Bayesian experimental design 
allows for an adaptive measuring scheme, where the location for a new mea- 
surement is determined by measurements already conducted. For example, 
when the intensities of several wavelengths have already been measured, the
question that BED attempts to answer is which wavelength should be measured
next so that the estimation can be made most efficiently. Such an approach
fits naturally to the post-measurement refinement of the AdaScope, as
individual localized axial positions can be scanned dynamically.


The profit generated by a new measurement at a certain wavelength is described
by a utility function over the design space. There are various utility
functions which focus on different aspects of the design. Chaloner and
Verdinelli [Cha95] have presented an overview of Bayesian optimal design and
discussed appropriate choices for the utility function. For parameter
estimation, a common choice is the expected Shannon information gain. The
additional information gained through the new measurement is represented by
the Kullback-Leibler (KL) divergence between the current posterior
distribution and the updated posterior distribution after the new measurement.
The utility function is expressed as the expectation of the KL divergence
under the posterior predictive distribution:


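\[
\begin{aligned}
U(\xi) &= \mathbb{E}_{g|\xi}\left[ D_{\mathrm{KL}}\left( p(\theta|G,g,\xi) \,\|\, p(\theta|G) \right) \right] \\
&= \iint p(\theta|G,g,\xi)\,\log\frac{p(\theta|G,g,\xi)}{p(\theta|G)}\,\mathrm{d}\theta\; p(g|G,\xi)\,\mathrm{d}g \\
&= \iint \left\{ \log p(g|\theta,\xi) - \log\left[ \int p(\theta|G)\,p(g|\theta,\xi)\,\mathrm{d}\theta \right] \right\} p(\theta|G)\,p(g|\theta,\xi)\,\mathrm{d}\theta\,\mathrm{d}g,
\end{aligned}
\tag{4.22}
\]

where ξ represents the possible designs, i.e., the next wavelength to be
measured.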


The double integral in this utility function cannot be calculated analytically
and is therefore solved by a nested Monte Carlo (MC) approximation using the
posterior samples drawn for parameter estimation [Rya03]:

\[ U(\xi) \approx \hat{U}_{N,M}(\xi) = \frac{1}{N}\sum_{i=1}^{N} \left\{ \log p(g^i|\theta^i,\xi) - \log\left[ \frac{1}{M}\sum_{j=1}^{M} p(g^i|\theta^j,\xi) \right] \right\}, \tag{4.23} \]

where {θ^i} ∪ {θ^j} are drawn from p(θ|G) and {g^i} are drawn from
p(g|θ^i,ξ). The numbers of samples to be drawn are controlled by N and M.

Finally, the task is to find the ξ which maximizes the utility function above:

\[ \xi^* = \arg\max_{\xi\in[0,1]} U(\xi). \tag{4.24} \]

Although there are stochastic optimization techniques for such a problem, the
utility function is typically calculated for a grid of discrete design points
and the design point with the highest utility is selected for the next
measurement. Notice that this approach is based on the so-called myopic
design, which means that only one further step is considered based on the
current situation. This does not guarantee a truly optimal design for an
experiment with multiple measurements, but in general works very well as a
greedy method.
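
A minimal sketch of the nested Monte Carlo estimator and the grid search over
the design space is given below. It assumes the Gaussian signal and noise
model of this section with hypothetical values for the signal width and the
noise level, and it uses synthetic stand-ins for the posterior samples. All
function names are illustrative.

```python
import numpy as np

SIGMA = 0.1      # assumed signal width in normalized wavelength units
SIGMA_N = 0.05   # assumed standard deviation of the measurement noise

def signal(xi, theta1, theta2):
    """Gaussian signal model: theta1 = peak location, theta2 = amplitude."""
    return theta2 * np.exp(-(xi - theta1) ** 2 / (2.0 * SIGMA ** 2))

def log_likelihood(g, mean):
    """Log density of the normal measurement model."""
    return -0.5 * ((g - mean) / SIGMA_N) ** 2 - np.log(SIGMA_N * np.sqrt(2.0 * np.pi))

def nested_mc_utility(xi, posterior_samples, rng, n_outer=400, n_inner=400):
    """Nested MC estimate of the expected information gain at design xi."""
    outer = posterior_samples[rng.integers(len(posterior_samples), size=n_outer)]
    inner = posterior_samples[rng.integers(len(posterior_samples), size=n_inner)]
    mean_outer = signal(xi, outer[:, 0], outer[:, 1])
    g = mean_outer + SIGMA_N * rng.standard_normal(n_outer)   # g^i ~ p(g | theta^i, xi)
    log_lik = log_likelihood(g, mean_outer)                   # log p(g^i | theta^i, xi)
    mean_inner = signal(xi, inner[:, 0], inner[:, 1])
    log_lik_inner = log_likelihood(g[:, None], mean_inner[None, :])   # shape (N, M)
    # log of the inner average, computed in a numerically stable way
    peak = log_lik_inner.max(axis=1, keepdims=True)
    log_evidence = peak[:, 0] + np.log(np.mean(np.exp(log_lik_inner - peak), axis=1))
    return float(np.mean(log_lik - log_evidence))

rng = np.random.default_rng(1)
# Synthetic posterior samples: uncertain location, known amplitude
samples = np.column_stack([rng.uniform(0.4, 0.6, 2000), np.full(2000, 0.8)])

designs = np.linspace(0.0, 1.0, 21)            # grid of candidate wavelengths
utilities = np.array([nested_mc_utility(xi, samples, rng) for xi in designs])
best = float(designs[np.argmax(utilities)])    # myopic choice of the next design
print(np.round(utilities, 2))
print("next wavelength:", best)
```

In this toy setting, wavelengths far from the presumed peak carry almost no
information, since all posterior samples predict nearly the same intensity
there, whereas wavelengths where the predictions disagree receive a high
utility.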


As an example, Figure 4.25 demonstrates the adaptive measurement of a
chromatic confocal signal. The first column shows the signal to be measured
and the corresponding measurement in each step. In these graphs, ḡ denotes the
signal to be measured, g' represents the new measurement in each step and G
contains all measurements conducted previously. The second column shows the
utility function over the design space in each step. In this example, the
measurement starts by recording the intensity of the wavelength in the middle.
Based on the measurement result, parameter estimation is conducted and the
utility function over all wavelengths is calculated. In the next step, the
intensity is measured at the wavelength which has the largest utility value.
These two steps can be repeated multiple times until the utility function
becomes close to zero for the whole design space, indicating that new
measurements no longer bring any additional information. The wavelength is
normalized to a range from zero to one as the calculations are all based on
simulations.


Figure 4.26 shows the comparison between the uniform measurement scheme and
the adaptive measurement scheme based on BED. As seen from the posterior
samples, with the same number of measurement steps, the adaptive approach
typically generates much more concentrated samples, indicating less
uncertainty for the parameter estimation. The reason is that the adaptive
approach tends to make new measurements at locations where more information is
expected to be gained regarding the parameters.

Figure 4.25: Adaptive measurement of a chromatic confocal signal. Left column:
each measurement step. Right column: utility function after each measurement
step.

Figure 4.26: Comparison between uniform measurement and adaptive measurement.
First row: uniform measurement and its corresponding posterior estimation.
Second row: adaptive measurement and its corresponding posterior estimation.


4.4.3 RNN-based Acceleration 


As discussed in Section 4.4.2, the utility function in Bayesian experimental
design can be approximated by the nested Monte Carlo method shown in
Equation 4.23. One major disadvantage of this approach is its slow speed. The
nested MC approximation is only asymptotically unbiased as an estimator of the
utility function. The bias and the variance of the estimator depend on the
number of posterior samples. As shown in a previous study [Rya03], the
variance can be represented as A_1(ξ)/N + A_2(ξ)/(NM) and the bias can be
represented to leading order by A_3(ξ)/M, where the A_i are terms depending on
the sampling distribution and M and N control the numbers of samples to be
drawn in the nested MC procedure as defined in Equation 4.23. The number of
samples needed for experimental design is naturally much larger than that for
pure inference. To make things even worse, the inner loop of this nested MC
has to be performed for each design candidate individually. For these reasons,
even with today's faster computers, full Bayesian experimental design is only
implemented in limited fields, such as pharmaceutical studies and astronomy.
What these fields have in common is that although the underlying model is
often very complex, the time interval between two experiments is also very
long, thus allowing a good design to be found in a Bayesian way.


Dynamic localized measurement of the chromatic confocal signal is exactly 
the opposite. Real-time decisions have to be made based on a relatively simple 
model. If the design speed is not fast enough, it would be more efficient to 
simply scan the whole wavelength range like a spectrometer. To accelerate 
the Bayesian experimental design process, a specific type of neural network, 
i.e., the recurrent neural network, can be trained as an approximation. 


The inspiration for using this model originates from a recent topic in the
computer vision community, namely the Visual Attention Model [Ba1]. For
pattern recognition tasks, the researchers try to mimic the human vision
system using a recurrent network. Instead of performing classification on the
complete image, a small image patch is processed by the RNN, and the output is
both the classification result and where to look next. The training is
implemented with reinforcement learning. The visual attention model and
Bayesian experimental design share many similarities, as both attempt to gain
more information through a series of adaptive measurements or observations.


For a conventional feed-forward neural network with a single hidden layer, 
the propagation of data can be expressed as: 


\[
\begin{aligned}
s &= f_s(W_s x + b_s) \\
o &= f_o(W_o s + b_o)
\end{aligned}
\tag{4.25}
\]

where x denotes the input signal, and s and o represent the activations of the
hidden layer and the output layer, respectively. W_s and W_o are matrices
containing weights describing the connections between the layers. The
non-linear activation functions are represented by f_s and f_o, which can take
various forms, and b_s and b_o stand for the biases. More layers can be added
to form more complex networks.

Figure 4.27: Graph representations of the feed-forward neural network and the
recurrent neural network.


A recurrent neural network is capable of “memorizing” the previous input 
data due to the introduction of a feedback loop in the hidden layer. Although 
more sophisticated variations have been developed, the simplest form of an 
RNN can be expressed as: 


\[
\begin{aligned}
s_t &= f_s(W_s x_t + W_r s_{t-1} + b_s) \\
o_t &= f_o(W_o s_t + b_o)
\end{aligned}
\tag{4.26}
\]

where t stands for the time step and W_r is a matrix describing the weights of
the feedback loop.
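
The recurrence can be made concrete with a short forward pass. The sketch
below is only an illustration of the simplest RNN form: the weight names, the
tanh hidden activation, the linear output, the zero initial state and the
random weights are all assumptions, and no training is involved.

```python
import numpy as np

def rnn_forward(x_seq, W_s, W_r, b_s, W_o, b_o):
    """Forward pass of a simple RNN with tanh hidden units and linear output."""
    s = np.zeros(b_s.shape)          # initial hidden state (assumed to be zero)
    outputs = []
    for x in x_seq:
        # the hidden state combines the current input and the previous state
        s = np.tanh(W_s @ x + W_r @ s + b_s)
        outputs.append(W_o @ s + b_o)
    return np.array(outputs)

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 2, 8, 3
W_s = rng.normal(size=(n_hidden, n_in))       # input weights
W_r = rng.normal(size=(n_hidden, n_hidden))   # feedback weights
b_s = np.zeros(n_hidden)
W_o = rng.normal(size=(n_out, n_hidden))      # output weights
b_o = np.zeros(n_out)

# Two sequences that differ only in their first element
seq_a = np.array([[0.1, 0.9], [0.5, 0.2], [0.7, 0.4]])
seq_b = np.array([[0.9, 0.1], [0.5, 0.2], [0.7, 0.4]])
out_a = rnn_forward(seq_a, W_s, W_r, b_s, W_o, b_o)
out_b = rnn_forward(seq_b, W_s, W_r, b_s, W_o, b_o)
print(out_a[-1])
print(out_b[-1])
```

Although the last inputs of both sequences are identical, the final outputs
differ because the hidden state carries information from the earlier steps.
This memory is what allows a recurrent network to base the next design on all
previous measurements.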


To train an RNN for the approximation of Bayesian experimental design, a 
series of experiments are simulated based on the measurement model and 
full Bayesian experimental design. Each simulated experiment consists of ten 
measurement steps of one chromatic confocal peak. The measurements and 
the corresponding utility functions are stored as training data for the RNN,
which can be expressed in the following form:

\[
\begin{aligned}
l_t &= W_l \lambda_t + b_l \\
m_t &= W_m g_t + b_m \\
s_t &= l_t \circ m_t \\
k_t &= \mathrm{LSTM}(W_k, s_t, s_{t-1}, b_k) \\
o_t &= \mathrm{ReLU}(W_o k_t + b_o)
\end{aligned}
\tag{4.27}
\]

where l_t is a hidden layer with 200 neurons to encode the measurement
location and m_t is a hidden layer, also with 200 neurons, to encode the
measured intensity. The measured wavelength at this time step is denoted by
λ_t and the measured signal at this time step is represented by g_t. The layer
s_t merges l_t and m_t through element-wise multiplication (the Hadamard
product). The layer k_t is a more sophisticated recurrent layer, namely the
Long Short-Term Memory (LSTM) [Hoc97], which memorizes information from
previous measurement steps of an experiment. The output layer is denoted by
o_t, with the rectified linear unit (ReLU) as the activation function. The
weights and biases of each layer are represented by W_(·) and b_(·)
respectively. The collection of weight matrices for the LSTM layer is denoted
by W_k. The target of training is to find the weights and biases which best
fit the simulated experiments. Training is conducted with an RMSProp optimizer
with the objective of minimizing the mean squared logarithmic error. The whole
process is implemented in Python based on TensorFlow and Keras [Cho15], and is
computed on an Nvidia GTX 1050 graphics card. The training takes a couple of
hours, but during measurement, the feed-forward calculation of an RNN is much
faster than full Bayesian experimental design, which requires multiple nested
MC sampling runs.


As a comparison, 300 experiments of chromatic confocal measurements are
simulated using three approaches: full Bayesian experimental design,
approximation using RNN, and equidistant measurement. Parameters of the signal
are drawn randomly. As can be seen from Figure 4.28, measurement with
Bayesian experimental design has a lower mean absolute error compared with
an equidistant measurement method when the number of measurements is
equal. The approximation by the recurrent neural network does not perform
Figure 4.28: Comparison of uniform sampling, Bayesian experimental design, and Bayesian
experimental design simulated through RNN.


as well as the Bayesian experimental design, due to the errors generated in 
the utility functions. However, it still yields a lower mean absolute error for 
parameter estimation compared with the equidistant measurement scheme. 


A conventional feed-forward neural network with even just a single hidden
layer has been proven to be a universal approximator [Cyb89], which indicates
that any continuous function can be approximated by a neural network with a
single hidden layer as long as the layer is large enough. The RNN is even more
powerful and has been proven to be Turing-complete [Sie95]. While the training
of the feed-forward neural network can be seen as optimization over functions,
the training of the recurrent neural network can be seen as optimization over
programs. Theoretically, there exists an RNN which perfectly approximates the
Bayesian experimental design of a specific model.


For MCMC-based BED, the sampling of the parameters with N = 640 and
M = 1000, which equals 640000 samples, takes roughly 18 s. With 101 designs,
one sample of the measurement result is drawn per design for each parameter
pair θ, which also takes roughly 18 s. The total number of measurement samples
is therefore 640000 × 101 = 64640000. The sampling process takes a significant
amount of time as it runs on a CPU (Intel i5-4210M). In contrast, the
RNN-based BED is implemented on a GPU (Nvidia GTX 1050) and the complete
network consists of 341901 trainable weights and biases. A single inference of
all utility values for one measurement takes merely 28 ms, which is
more than 600 times faster than the MCMC-based BED.
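The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes scalar inputs to both encoding layers, an LSTM with 200 units and an output layer with one neuron per design (101); the LSTM size and the output size are not stated in the text and are assumptions. Under these assumptions the standard parameter-count formulas reproduce the reported number of trainable weights and biases.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def lstm_params(n_in, n_units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (n_units * (n_in + n_units) + n_units)

n_enc = 200       # neurons in each encoding layer (from the text)
n_lstm = 200      # LSTM units (assumption)
n_designs = 101   # candidate designs, one utility value each (assumed output size)

total_params = (
    2 * dense_params(1, n_enc)           # location and intensity encoders
    + lstm_params(n_enc, n_lstm)         # recurrent layer
    + dense_params(n_lstm, n_designs)    # output layer
)

n_parameter_samples = 640 * 1000                    # N x M
n_measurement_samples = n_parameter_samples * n_designs
speedup = 18.0 / 0.028    # parameter sampling alone versus one RNN inference
```

Counting only the 18 s for parameter sampling already gives a ratio of roughly 640; including the second 18 s sampling stage would make the ratio larger still.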


4.4.4 Summary 


For post-measurement refinement, a localized scan can be performed in the 
vicinity of the peak position based on the measurement result from the pre- 
vious measurement stage. Although the scan can be implemented through 
uniform sampling, a more efficient dynamic sampling approach is proposed 
based on Bayesian experimental design. Based on the intensity measurement 
of several wavelength positions, the new wavelength position to be measured 
is determined through the computation of the utility function over possible 
wavelength positions. The wavelength with the largest utility value indicates 
the highest expected information gain if measured next. Although the com- 
putation can be implemented based on MCMC sampling, the speed of the 
computation is limited due to the large number of samples required. To ac- 
celerate the process, an RNN is developed and trained based on simulated 
experiments to approximate the computation of the utility function. The
performance of the RNN approximation is lower than full BED but higher than
uniform sampling, while the computation is more than 600 times faster
than the MCMC-based BED.


5 Evaluation and Results


This chapter presents an evaluation of the measurement methods introduced
previously within the context of the cascade measurement strategy. In
Section 5.1, the configuration and characteristics of the experimental setup
are presented. To evaluate the proposed measurement techniques, several
benchmark measurements are performed based on conventional array
confocal scanning and the results are presented in Section 5.2. Section 5.3
discusses results from the compressive shape from focus method, which is
capable of roughly locating the object in a wide axial range using a minimum
number of frames. In Section 5.4, measurement results of the iterative array
adaptation method are analyzed, where two iterations of measurements are
performed. The results from the direct area scanning method are demonstrated
in Section 5.5. In Section 5.6, results from the various methods are
summarized and compared.


5.1 Experimental Setup 


The spatial DMD, the camera sensor and the microscope arms are carefully
aligned so that the camera image covers the complete effective area of the
DMD. Relative planar rotation between the two coordinate systems is also
minimized (ignoring the intrinsic 180° rotation) through alignment. Figure 5.1
shows an image of the 1951 USAF Resolution Test Target from Thorlabs
(R1DS1P). Full-field illumination is applied by switching on all pixels on the
spatial DMD. The illumination spectrum is a Gaussian function centered
around 555 nm with a FWHM of 1 nm. The slightly brighter rectangular area
indicates the illumination area on the test target. According to the result
of the OpticStudio simulation with diffraction-limited optics,
the spot size of the optical system is smaller than the pixel size of the camera
even at the maximum wavelength of 680 nm. Consequently, the resolution of the
imaging system can be considered to be instrument limited. At 555 nm, the
paraxial magnification equals 0.373, which leads to an object-side pixel size
of 2.4 µm. The theoretical resolution limit of the system can be calculated to
be 208.3 lp mm⁻¹. As shown by the zoomed patch on the right of Figure 5.1,
the system is capable of resolving test patterns up to element #4 of group #7.
This element indicates a resolution of 181 lp mm⁻¹, which closely matches
the simulation results. Therefore, optical alignment is considered to be close
to ideal.
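The two resolution figures above can be reproduced as follows. The sketch uses the standard USAF 1951 relation (resolution in line pairs per millimetre equals 2 raised to the power group + (element - 1)/6) and a sampling limit of one line pair per two object-side pixels; the function names are chosen for this example only.

```python
def usaf_resolution_lp_per_mm(group, element):
    """Spatial frequency of a USAF 1951 target element in line pairs per mm."""
    return 2.0 ** (group + (element - 1) / 6.0)

def sampling_limit_lp_per_mm(pixel_size_um):
    """One line pair needs at least two pixels (Nyquist criterion)."""
    return 1000.0 / (2.0 * pixel_size_um)

limit = sampling_limit_lp_per_mm(2.4)          # object-side pixel size of 2.4 um
resolved = usaf_resolution_lp_per_mm(7, 4)     # group #7, element #4
next_element = usaf_resolution_lp_per_mm(7, 5) # the next finer element
```

The limit evaluates to about 208.3 lp/mm and group 7, element 4 to about 181.0 lp/mm, in agreement with the values in the text.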


Figure 5.1: Image of 1951 USAF Resolution Test Target from Thorlabs (R1DS1P). Illumination 
spectrum is a Gaussian function centered around 555 nm with a FWHM of 1 nm. 


A camera calibration procedure is first implemented (Section B.2.3), after
which several experiments are conducted based on the conventional array
scanning method as a benchmark and on the cascade measurement strategy.
Two test targets are used for the measurement experiments. First, an optical
mirror is used to calibrate the image field of the optical system and to
characterize the sensitivities of the various methods through a series of
measurements. Second, a two-Euro coin is selected as a test target to
demonstrate the capability of AdaScope in a practical scenario.


The system is focused on a small area on the two-Euro coin with the letters
E and U. This area is chosen due to its complex structure. There are three
major levels of height in this area. The bottom surface and the top surface of
the letter face are both designed to be flat, whereas the middle surface of the
European map has a wavy profile packed with small indentations. All results
are laterally presented in the coordinate system of the spatial DMD.


Figure 5.2: Test target for experimental investigation. (Source: Pixabay, under CC0 Creative
Commons)


5.2 Benchmark: Confocal Array Scanning 


To provide a benchmark for the proposed methods, conventional array scanning
is implemented with AdaScope to provide an accurate but slow measurement
of the target. For each pinhole array generated by the spatial DMD, a
series of images is captured while the illumination wavelength is scanned.


As shown in Figure 5.3, with an axial shift of 95.25 µm from the focal
position, the blurred focal spot of a single DMD pixel easily reaches a
distance of 15 pixels while still maintaining sufficient energy for crosstalk.
Since the imaging system is well optimized, the spherical aberration of the
chromatic objective is aggressively corrected in order to reduce the spot size.
Nevertheless, such aggressive correction often leads to a blurred spot
distribution where more energy is concentrated in the outer area, making the
system more susceptible to crosstalk. With the current setup, a pitch distance
of 10 pixels is considered the minimum for acceptable measurement, while a
pitch distance of at least 20 pixels is needed to suppress crosstalk to a
minimum level. For the benchmark measurement, a pitch distance of 20 pixels
has been selected, which corresponds to 400 lateral scans per axial position.
Figure 5.3: Exemplary spatial distribution of the blurred light at a distance of 95.25 µm from the
focal plane, plotted in the pixel coordinate system of the spatial DMD.

Figure 5.4: With an illumination bandwidth of 1 nm, the axial response has a FWHM of 47.5 µm.


For the axial direction, the response of the system can be recorded by scanning
through a series of illumination wavelengths. As illustrated in Figure 5.4, with
an illumination bandwidth of 1 nm, the axial response has a FWHM of 47.5 µm.
To accurately locate the position of the response peak, the axial step size has
to be at most half of the FWHM of the axial response, which corresponds to
196 axial steps for the complete measurement range. To provide
a more accurate benchmark measurement, an axial step of 10 µm has been
implemented for a smaller axial range of 0.5 mm, which is more than sufficient
to cover the profile variation of the Euro coin.


Figure 5.5: Confocal signal for the axial position #20, #23, #26 and #29. Corresponding illumi- 
nation wavelengths are labeled in red. 


Figure 5.5 shows examples of the measured signal at several axial positions.
As can be seen, due to the confocal filtering, only areas which are in
focus return a high intensity in the captured image, while the out-of-focus
areas are close to black.




Figure 5.6: Extended depth of field image of the test target. 
In post-processing, by taking the maximum axial intensity for each lateral
location, an extended depth of field image can be reconstructed, showing
the surface texture of the test target (Figure 5.6). For areas where the
height changes drastically, such as the edges of the two letters, the intensity
drops heavily due to self-occlusion. Confocal measurements for such areas
are considered unreliable and are excluded based on a threshold set for the
maximum intensity. In the following figures, such areas are shown in white.




Figure 5.7: Height map reconstructed from the confocal signal of array scanning through Gaus- 
sian fitting. 


To retrieve the height information of the test target, Gaussian fitting is
implemented for the confocal signal, in the form of
$g = A\exp(-(z-\mu)^2/(2\sigma^2))$, where $g$ represents the expectation of
the gray value in the captured image and $z$ represents the axial focus
position of the corresponding wavelength. The center position of the Gaussian
peak $\mu$ is considered as the height of the object, while $\sigma$ is
directly related to the FWHM of the signal peak. The fitting process is
performed only for five data points around the maximum axial intensity, where
the signal-to-noise ratio is highest. Figure 5.7 shows the height map of the
target sample reconstructed from the confocal signal, with the colorbar scaled
to show a height range of 120 µm. As a benchmark measurement, the result of
the conventional array scanning is very accurate in the lateral directions as
well as in the axial direction. The three layers of the coin surface are
faithfully reconstructed. Structural defects such as scratches can be easily
located from such a height map. For this well-used coin, the height
difference between the top surface of the “EU” letters and the base surface of
the coin is approximately 110 µm.
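One simple way to carry out such a five-point Gaussian fit is sketched below. It relies on the fact that the logarithm of a Gaussian is a parabola, so a second-order polynomial fit to the log-intensities yields the three parameters in closed form. This log-parabola approach is an assumption for illustration; the text does not specify which fitting algorithm is actually used, and the approach requires strictly positive samples.

```python
import numpy as np

def fit_gaussian_peak(z, g, half_window=2):
    """Fit g = A*exp(-(z-mu)^2/(2*sigma^2)) to the samples around the maximum."""
    i = int(np.argmax(g))
    lo, hi = max(i - half_window, 0), min(i + half_window + 1, len(g))
    zc = z[lo:hi] - z[i]                       # centre z for numerical stability
    # ln g = c2*zc^2 + c1*zc + c0 is a parabola for a Gaussian signal
    c2, c1, c0 = np.polyfit(zc, np.log(g[lo:hi]), 2)
    sigma = np.sqrt(-1.0 / (2.0 * c2))
    mu = z[i] - c1 / (2.0 * c2)
    amplitude = np.exp(c0 - c1 ** 2 / (4.0 * c2))
    return amplitude, mu, sigma

# Synthetic confocal signal sampled with a 10 um axial step (values in um)
z = np.arange(0.0, 500.0, 10.0)
g = 200.0 * np.exp(-(z - 203.0) ** 2 / (2.0 * 18.2 ** 2))
amplitude, mu, sigma = fit_gaussian_peak(z, g)
```

For noise-free data the fit recovers the peak position between two sampling points exactly; with real data, noise in the low-intensity samples is amplified by the logarithm, which is one reason to restrict the fit to the points around the maximum.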


Despite the good accuracy and resolution, the measurement speed of the
conventional array scan is relatively slow. Even for the limited axial
measurement range implemented in this experiment, a total of 49 × 20 × 20 scans
have to be conducted, which would take more than 10 minutes to complete for a
camera capable of 30 fps. For the complete measurement range with a coarser
axial step, the measurement time would be at least four times longer, limiting
the application of such a method to situations which are not time-critical.
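The frame budget stated above follows directly from the scan geometry, as the short calculation below shows. The helper name is chosen for this example; the numbers (49 axial positions, pitch of 20 pixels, 30 fps, 196 axial steps for the full range) are taken from the text.

```python
def array_scan_frames(axial_steps, pitch_px):
    """Each pinhole array must be shifted pitch x pitch times to cover the field."""
    return axial_steps * pitch_px * pitch_px

frames = array_scan_frames(49, 20)                 # benchmark configuration
minutes = frames / 30.0 / 60.0                     # acquisition time at 30 fps
full_range_ratio = array_scan_frames(196, 20) / frames
```

This gives 19600 frames, roughly 10.9 minutes of pure acquisition time, and a factor of four for the complete axial range with 196 steps.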


5.3 Pre-measurement: Compressive Shape from Focus


In this section, measurement results of both the conventional SFF and the 
compressive SFF are analyzed and compared. The compressive SFF method, 
with its much faster measurement speed, serves as the pre-measurement in 
the AdaScope system, based on which the main measurement can be initial- 
ized. 


As discussed in Section 44.1, the conventional shape from focus method has a
limited measurement speed, especially in a high-NA system, due to its
requirement to sample the focal planes as densely as possible. With an NA of
0.33, the depth of field of the AdaScope is very small compared to the axial
range of measurement. To make sure all relevant axial positions are covered, a
series of 196 spectra are generated for the AdaScope system, each with a FWHM
of 1 nm. The step size of the wavelength is chosen nonlinearly to counter
the nonlinear chromatic aberration, which leads to an axial step size of 23 µm.
Figure 5.8 illustrates several example frames from the focal stack with their
corresponding sharpness measure calculated using the modified Laplacian
operator. As shown by the image, the sharpness measure reaches a peak
when the underlying location is in focus. By tracking the axial location where
the sharpness measure reaches its maximum, the surface profile of the target
object can be reconstructed.
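The principle can be sketched in a few lines of NumPy. The example below uses a common form of the modified Laplacian (sum of the absolute second differences in x and y with a step of one pixel) followed by a box-window summation and a per-pixel argmax over the stack. The step size, the window radius and the edge padding are assumptions for this sketch and may differ from the actual implementation.

```python
import numpy as np

def modified_laplacian(img):
    """|d2I/dx2| + |d2I/dy2| from second differences with a one-pixel step."""
    p = np.pad(img.astype(float), 1, mode="edge")
    c = p[1:-1, 1:-1]
    ml_x = np.abs(2 * c - p[1:-1, :-2] - p[1:-1, 2:])
    ml_y = np.abs(2 * c - p[:-2, 1:-1] - p[2:, 1:-1])
    return ml_x + ml_y

def box_sum(a, radius):
    """Sum over a (2r+1) x (2r+1) window: the lateral smoothing step."""
    p = np.pad(a, radius, mode="edge")
    out = np.zeros_like(a)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def depth_from_focus(stack, z_positions, radius=2):
    """stack: (n_frames, H, W). Returns the z of maximum sharpness per pixel."""
    focus = np.stack([box_sum(modified_laplacian(f), radius) for f in stack])
    return np.asarray(z_positions)[np.argmax(focus, axis=0)]

# Synthetic stack: the same texture with a contrast that peaks in frame 3
rng = np.random.default_rng(0)
texture = rng.random((16, 16))
contrast = np.array([0.1, 0.3, 0.6, 1.0, 0.5, 0.2])
stack = 0.5 + contrast[:, None, None] * (texture - 0.5)
z_positions = np.arange(6) * 23.0                  # axial step of 23 um
depth = depth_from_focus(stack, z_positions)
```

In this synthetic example every pixel is assigned the axial position of frame 3, where the local contrast is highest.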


For the compressive SFF measurement, a series of measurement filters have to
be constructed so that the focal stack can be acquired in a compressive manner.
In the simulation presented previously, such filters are constructed
through PCA of various training samples based on different surface profiles
and textures. In practice, the SNR of the training samples constructed from
real data is not high enough to generate reliable filters. Therefore, synthetic
training samples are generated based on an axially shifted Gaussian signal.
The first three channels of the generated filters are illustrated in Figure 5.9.
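A possible way to generate such filters from synthetic training data is sketched below: one Gaussian training signal is created per axial position, the set is mean-centred, and the leading right-singular vectors are taken as filters. The axial range of 4.5 mm, the peak width, the mean-centring step and the one-peak-per-position training set are assumptions for this example.

```python
import numpy as np

def pca_filters(z, sigma, n_filters):
    """Leading principal components of axially shifted Gaussian training signals."""
    centers = z                                    # one training peak per axial position
    train = np.exp(-(z[None, :] - centers[:, None]) ** 2 / (2.0 * sigma ** 2))
    train -= train.mean(axis=0)                    # centre the data before PCA
    _, _, vt = np.linalg.svd(train, full_matrices=False)
    return vt[:n_filters]                          # rows are orthonormal filters

z = np.linspace(0.0, 4.5, 196)                     # axial positions in mm (assumed range)
filters = pca_filters(z, sigma=0.02, n_filters=14)
```

Each row of `filters` is a weighting over the 196 axial positions. The rows are mutually orthonormal and contain both positive and negative values, which is why a physical realization needs two non-negative exposures per filter.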


As the filters contain both positive and negative values, each filter is
realized with two separate filters in practice. The filter weightings are
achieved directly through the illumination spectra coupled with the axial
chromatic aberration. The intensity of the illumination spectra and the camera
exposure time are adjusted accordingly so that the dynamic range of the camera
is fully utilized without saturation. For 14 compression filters, 28 images are
captured and the final compressed frames are calculated through the subtraction
between the positive images and the negative images (Figure 5.10).
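The splitting of a signed filter into two non-negative exposures can be illustrated as follows. The focal stack here is random synthetic data standing in for the real one, and the optical weighting by the illumination spectrum is emulated by a weighted sum over the stack; both are simplifications for this sketch.

```python
import numpy as np

def split_filter(w):
    """Split a signed filter into two non-negative illumination weightings."""
    return np.clip(w, 0.0, None), np.clip(-w, 0.0, None)

def compressed_frame(focal_stack, w):
    """Emulate the two exposures for one filter and subtract them."""
    w_pos, w_neg = split_filter(w)
    img_pos = np.tensordot(w_pos, focal_stack, axes=1)   # exposure with positive part
    img_neg = np.tensordot(w_neg, focal_stack, axes=1)   # exposure with negative part
    return img_pos - img_neg

rng = np.random.default_rng(0)
focal_stack = rng.random((196, 4, 4))      # synthetic stand-in for the full focal stack
w = rng.normal(size=196)                   # one signed compression filter
frame = compressed_frame(focal_stack, w)
n_images = 2 * 14                          # two exposures for each of the 14 filters
```

The difference of the two exposures equals the projection of the focal stack onto the signed filter, so no information is lost by the split, at the cost of doubling the number of captured images.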


After the compressed focal stack is captured, the focus measure of the full
focal stack can be reconstructed with the previously presented reconstruction
algorithm. Similar to the conventional SFF method, the surface profile of the
target object is reconstructed by tracking the axial location where the
sharpness measure reaches its maximum.


Figure 5.11 illustrates the reconstructed height maps from both the
conventional SFF method and the compressive SFF method. The result from the
conventional SFF method is based on a focal stack containing 196 images
captured at different axial positions. The height map clearly shows the base
surface, the top letter surface and the wavy European map surface in the
middle. However, when examined closely, small defects occur throughout the
measurement field in a random manner, mainly because the SFF
method is generally vulnerable to camera noise and the accuracy of the
measurement depends strongly on the texture of the target surface. Additionally,
Figure 5.8: Example frames from the focal stack with their corresponding sharpness measure
results. For the sharpness measure result, a brighter grey value represents a higher
degree of sharpness.




5 Evaluation and Results 


[Figure 5.9 plot: "Linear Compression Filters"; curves: Channel 1, Channel 2, Channel 3;
x-axis: Axial Position / mm; y-axis: Weight]


Figure 5.9: Linear compression filters of the first three channels. 


the lateral smoothing procedure required when computing the focus measure 
also degrades the lateral resolution of the system. Overall, the conventional 
SFF method is considered much less reliable than the conventional confocal 
scan for microscopic surface profilometry, and is thus seldom applied in an 
industrial environment. 


In the AdaScope system, instead of spending so many frames on the conventional
SFF method, which is not capable of delivering robust results, the
compressive SFF method is utilized to provide a pre-measurement, based on
which the main measurement stage can be initialized. With the prior knowledge
that the axial region of interest is significantly smaller than the complete
axial measurement range, the task of the pre-measurement is to
limit the axial measurement range in the main measurement stage so that the
overall measurement efficiency can be improved. As shown in Figure 5.11,
although the height map reconstructed from the compressive SFF method is


Figure 5.10: Example frames from the compressively captured focal stack with their correspond- 
ing sharpness measure results. A polar colormap is utilized as the frame contains 
negative values, which are indicated by the blue color. For the sharpness measure 
result, a brighter grey value represents a higher degree of sharpness. 
Figure 5.11: Comparison of SFF and CSFF results. Top row: height maps. Bottom row: height
histogram.


more severely corrupted by noise compared to the conventional SFF result,
the histogram of the height map clearly indicates that a large number of
pixels are centered around 1.625 mm. By tracking the peak position of the height
histogram, the compressive SFF method provides good guidance on the axial
location and range for the main measurement stage with a minimum number
of image captures, and is 7 times faster than the conventional SFF method
(28 instead of 196 frames).
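The histogram-based initialization can be sketched as follows. The bin width, the span of the resulting axial range and the synthetic noisy height map are assumptions chosen for this example; the function simply centres the new range on the most populated histogram bin.

```python
import numpy as np

def axial_range_from_histogram(height_map, bin_width, span):
    """Centre an axial range of the given span on the histogram peak."""
    h = height_map[np.isfinite(height_map)]
    edges = np.arange(h.min(), h.max() + bin_width, bin_width)
    counts, edges = np.histogram(h, bins=edges)
    k = int(np.argmax(counts))
    center = 0.5 * (edges[k] + edges[k + 1])
    return center - span / 2.0, center + span / 2.0

# Synthetic noisy height map (mm): 20 % outliers spread over the full range,
# 80 % of the pixels clustered around the true object position
rng = np.random.default_rng(1)
height_map = np.concatenate([
    rng.uniform(0.0, 4.5, 2000),
    rng.normal(1.625, 0.02, 8000),
])
z_lo, z_hi = axial_range_from_histogram(height_map, bin_width=0.025, span=0.464)
```

Even with a substantial fraction of outliers, the histogram peak remains close to the true object position, which is all the pre-measurement needs to deliver.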


5.4 Main measurement I: Iterative Array Adaptation


This section presents the measurement result from the iterative array
adaptation method, which is considered as one candidate for the main measurement
stage. Based on the result from the compressive shape from focus measurement,
the axial measurement range is limited to a length of 464 µm centered
around the axial position of 1.625 mm. Two iterations of the array adaptation
are implemented.


Figure 5.12: Illumination channels for the first iteration.


For the first iteration, a grid with a pitch distance of 20 pixels is scanned
laterally, while the axial position is scanned through two channels of
illumination spectra as shown in Figure 5.12. In theory, the normalized
centroid position of the confocal peak, which also represents the axial
position of the target, can be retrieved directly by calculating the normalized
centroid position of the two-channel signal. Nevertheless, due to the crosstalk
between adjacent measurement locations as well as the asymmetric blurring
of the out-of-focus light, the two centroid positions are only linearly related
in a limited range.
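The ideal relation can be checked numerically. The sketch below assumes two complementary linear filters (a falling and a rising ramp over the axial range) and an ideal Gaussian confocal response without crosstalk; the filter shapes, the peak width and the peak position are assumptions for this example.

```python
import numpy as np

def two_channel_centroid(i1, i2):
    """Normalized centroid in [0, 1] from a falling-ramp and a rising-ramp channel."""
    return i2 / (i1 + i2)

z = np.linspace(0.0, 464.0, 2001)                   # axial range of iteration #1 in um
u = (z - z[0]) / (z[-1] - z[0])                     # normalized axial coordinate
w1, w2 = 1.0 - u, u                                 # assumed complementary linear filters
peak = np.exp(-(z - 300.0) ** 2 / (2.0 * 18.0 ** 2))  # ideal confocal response

i1, i2 = np.sum(peak * w1), np.sum(peak * w2)       # the two measured channel signals
z_est = z[0] + two_channel_centroid(i1, i2) * (z[-1] - z[0])
```

For this ideal, symmetric peak the two-channel estimate coincides with the peak position of 300 µm. Crosstalk and asymmetric blur add signal that is not centred on the peak, which is what breaks the linear relation in practice.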


Therefore, the relationship between the measured signals and the axial
position of the target must be calibrated (Figure 5.13). The calibration is
implemented with a broadband mirror, which is mechanically scanned while the
two channels of the signal are recorded. As can be seen from the calibration
result, strong non-linearity starts to appear at the edges of the measurement
Figure 5.13: Signal calibration for iteration #1. A broadband mirror is mechanically scanned
through the measurement range of iteration #1 while the signals of the two channels
are recorded. The linear fitting is conducted while the 5 starting data points and 5
ending data points are excluded.


range and thus the target should be placed within the central linear region to 
avoid measurement defects. 
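A linear calibration of this kind, with the five first and five last scan points excluded as in Figure 5.13, can be sketched as follows. The synthetic mirror scan, the number of scan points and the size of the emulated edge distortion are assumptions for this example.

```python
import numpy as np

def calibrate(centroid, z_mirror, n_skip=5):
    """Fit z = a * centroid + b, ignoring n_skip points at each end of the scan."""
    inner = slice(n_skip, len(centroid) - n_skip)
    a, b = np.polyfit(centroid[inner], z_mirror[inner], 1)
    return a, b

# Synthetic mirror scan: linear in the middle, distorted at both ends
z_mirror = np.linspace(0.0, 464.0, 59)      # mirror positions in um
centroid = z_mirror / 464.0                 # ideal normalized two-channel centroid
centroid[:5] += 0.03                        # emulated edge non-linearity
centroid[-5:] -= 0.03

a, b = calibrate(centroid, z_mirror)
height = a * 0.5 + b                        # map a measured centroid of 0.5 to a height
```

Because the distorted end points are excluded, the fit recovers the slope of the central linear region, and a centroid of 0.5 maps to the middle of the range.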


Figure 5.14 shows the raw signals from iteration #1, which are reassembled
from the array scanning result. Since the measurement is initialized based on
the result from the compressive SFF method, the axial measurement range is
chosen in a way that the target lies as close to the middle of the measurement
range as possible. This results in relatively similar intensity levels in
both channels. Nevertheless, the signal is sensitive enough to retrieve
three-dimensional information from just two channels. For the higher regions
of the “EU” letter face, the signal from channel #2 is clearly higher, while
for the lower regions of the coin base surface, the signal from channel #1 is
higher. By calculating the normalized centroid position of the two-channel
Figure 5.14: Raw signal reassembled from array scanning in iteration #1.


signal and mapping through the calibration curve, the surface profile can be
reconstructed (Figure 5.15).


Even with just two channels of measurement, the reconstructed height map
of the target clearly indicates three layers of structures. Nevertheless,
measurement defects can also be seen across the measurement field and the mean
absolute error compared to the result of array scanning from Section 5.2 is
15.3 µm.


The array adaptation method as originally proposed applies a binary search
approach in the axial direction from iteration to iteration. Nevertheless,
as shown previously, the edge of the axial measurement range often
demonstrates strong non-linearity in practice, which severely affects the
result of the binary search process. To counter this problem, the linear
measurement filters are chosen to cover an axial range larger than the target height



[Figure 5.15 plot: "Height Map - Itr #1"; colorbar: Height / mm]


Figure 5.15: Reconstructed height map from iteration #1. 


variation so that the target does not enter the edge of the measurement range.
Meanwhile, in the second iteration, instead of dividing the axial measurement
range in the middle, the dividing point is determined by the median value
of the reconstructed height map from iteration #1. As shown in Figure 5.16,
after the division, the mean value is calculated for each half based on the
reconstruction result from iteration #1. The new measurement filters are
centered around the two mean values while covering half of the original axial
range. In this way, the target always lies as close to the
center of the axial measurement range as possible. Such a procedure can
be implemented for further iterations as well.
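The adapted division step can be sketched as follows: the height values from the previous iteration are split at their median, and each new filter range is centred on the mean of one half while spanning half of the previous axial range. The function name, the handling of values equal to the median and the synthetic two-level height map are assumptions for this example.

```python
import numpy as np

def next_filter_ranges(height_map, z_min, z_max):
    """Median split of the heights and the two axial ranges for the next iteration."""
    h = height_map[np.isfinite(height_map)]
    split = np.median(h)
    half_span = (z_max - z_min) / 2.0          # each new filter covers half the old range
    ranges = []
    for part in (h[h <= split], h[h > split]):
        center = part.mean()                   # centre the new filter on the half's mean
        ranges.append((center - half_span / 2.0, center + half_span / 2.0))
    return split, ranges

# Synthetic two-level surface (mm) inside a 464 um range centred at 1.625 mm
height_map = np.concatenate([np.full(500, 1.59), np.full(500, 1.68)])
split, ranges = next_filter_ranges(height_map, z_min=1.393, z_max=1.857)
```

In this example the dividing point falls between the two levels, and each new range is centred on one level, so that neither level sits near the edge of its filter.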


In the second iteration, the lateral pitch is reduced to 10 pixels. Similar
to iteration #1, signal calibration has to be implemented for the reduced pitch
distance and axial measurement range. Figure 5.17 illustrates the
reconstruction result from iteration #2. Although measurement artifacts still
exist, the mean absolute error with respect to the confocal array scanning has
dropped to 14.7 µm.
Figure 5.16: The histogram of the height reconstructed from iteration #1 can be divided into two
halves which are measured by different linear filters in iteration #2. The span of the
x-axes in both figures represents the axial range of iteration #1.


Figure 5.17: Reconstructed height map from iteration #2. 
5.5 Main measurement II: Direct Area Scanning


This section presents the measurement result from the direct area confocal
scanning method, which is considered as another candidate for the main
measurement stage. Based on the result from the compressive shape from focus
measurement, the axial measurement range is limited to a length of 501 µm,
which is centered around 1.625 mm. The range is slightly different from the
axial range used in the array adaptation method to guarantee the optimum
tilting angle of the illumination field according to the lateral pitch of the
spatial DMD and the magnification of the imaging system.


Figure 5.18: Left column: raw camera frames from direct area scanning. Right column: confocal 
signal through reordering of the raw signal. 

Figure 5.18 illustrates the captured signal while the periodic tilted
illumination field is scanned laterally. The left column shows exemplary raw
frames from the camera, while the right column shows images in which the
signals are reordered so that each frame represents the signal of a certain
wavelength. Compared to the confocal signal from the conventional array
scanning method shown in Figure 5.5, the background level is clearly higher, due
to crosstalk from the adjacent locations in such a wide-field setup.
Nevertheless, with the help of the tilting angle, a confocal peak is still
visible as different areas get brighter when imaged in focus.


Similar to the post-processing for conventional array scanning, Gaussian
fitting is also implemented for the area scanning signal in a window centered
around the maximum intensity. Figure 5.19 illustrates the measurement
result using the proposed area scanning method, where several differences can
be observed.




Figure 5.19: Height map reconstructed from the confocal signal of area scanning through Gaus- 
sian fitting. 


Compared to the measurement result of iterative array adaptation
(Figure 5.15), the measurement result is clearly much smoother and very close
to the benchmark measurement by the array scanning method (Figure 5.7).
As a quantitative comparison, the mean absolute error with respect to the
benchmark measurement has further dropped to 8.9 µm. Despite the remaining
differences, the measurement result using the proposed method is completely
usable, revealing the three structure layers of the coin accurately. Center
areas of the indentation holes can be measured without being much affected
by the surrounding shadows. The wavy profile of the European map is also
faithfully recovered.


[Figure 5.20 plot: curves for Array (20 pixels), Array (10 pixels), Array (5 pixels) and
Direct Area Scanning; y-axis: Normalized Intensity / arb. unit]

Figure 5.20: Comparison of confocal signals from array scanning and direct area scanning.


A comparison of the confocal signal from the array scanning method and
the confocal signal from the direct area scanning method is presented in
Figure 5.20. All signals are extracted from the same location on the coin in a
relatively flat area which is located at the top left corner of the FoV. For the
array scanning method, as the pitch decreases, the background level rises,
leading to a wider confocal peak. The signal shape of the direct area scan
method lies between those of the array scan method with pitches of 10 pixels
and 5 pixels.
Figure 5.21: Histogram comparison of the fitting result for σ. For the array scanning method,
the pitch distance is 20 pixels.


Figure 5.21 illustrates the histograms of the fitting result of σ for all
measurement positions on the coin. The FWHM of the fitted Gaussian peak can be
calculated with $\mathrm{FWHM} = 2\sqrt{2\ln 2}\,\sigma \approx 2.35\,\sigma$.
For the conventional array scanning method with a pitch distance of 20 pixels,
the fitted σ values have an average of 18.2 µm, i.e. 42.7 µm in terms of FWHM.
For direct area scanning, the mean fitted σ value equals 39.1 µm, which
corresponds to a FWHM of 91.9 µm. In both cases, the measured axial responses
are wider than predicted by the theoretical analysis. Multiple factors are
responsible for this effect. First and foremost, the illumination Gaussian
spectra have a certain bandwidth instead of being a Dirac pulse. The axial
spread due to the chromatic aberration for the selected spectral bandwidth
(1 nm) in the current measurement range alone amounts to approximately 20 µm.
Secondly, various optical aberrations in the practical system, the calibration
error of the camera as well as camera noise all contribute to the broadening of
the intensity peak. Thirdly, the lateral extension of the spatial DMD pixel
also increases the illumination spot size, which leads to a wider axial FWHM.
Additionally, the periodic illumination pattern used in the direct area
scanning method also contributes to the crosstalk, which raises the background
level of the signal, leading to a wider peak.
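For reference, the conversion factor between σ and the FWHM follows from setting the Gaussian model used for the fit equal to half of its maximum:

```latex
\begin{aligned}
A\exp\!\left(-\frac{(z-\mu)^2}{2\sigma^2}\right) = \frac{A}{2}
&\;\Longrightarrow\; (z-\mu)^2 = 2\sigma^2\ln 2
\;\Longrightarrow\; z = \mu \pm \sigma\sqrt{2\ln 2},\\
\mathrm{FWHM} &= 2\sqrt{2\ln 2}\,\sigma \approx 2.355\,\sigma .
\end{aligned}
```

With the rounded factor 2.35 used in the text, σ = 39.1 µm corresponds to a FWHM of about 91.9 µm.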


5.6 Analysis and Comparison 


In this section, the accuracy of the two main measurement methods is
compared against the array scanning method with fixed pitch. As trivial as it may
seem, the characterization of the accuracy of a measurement system is a very
complex task. The errors of the measurement can be categorized into two
classes: random error and systematic error. Although the random error can
often be characterized in a more straightforward manner through statistical
analysis, the systematic error is much more challenging to quantify and could
vary from object to object. In practice, the combination of both contributes to
the final measurement error, making it difficult to separate one factor from the
other. While the camera noise and the instability of the illumination source
contribute most to the random error, the systematic error mostly originates
from the crosstalk between the adjacent measurement locations.


To test their robustness against random error, a series of experiments is
conducted for each method with a broadband mirror as the test target. The
mirror is chosen as the target due to its simple structure, which helps to
suppress and simplify the effect of crosstalk. A total of 5184 lateral locations
uniformly spread across the measurement field are observed through 25 repeated
measurements. The standard deviations of these measurements are used as an
indicator of the random part of the measurement uncertainty. The minimum,
average and maximum standard deviation values over all lateral locations are
listed in Table 5.1.
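The evaluation described above can be sketched in a few lines. This is a minimal illustration, not the evaluation code of this work: the function name `uncertainty_summary`, the use of the sample standard deviation (`ddof=1`) and the synthetic data are assumptions.

```python
import numpy as np


def uncertainty_summary(heights):
    """Summarize the random error of repeated height measurements.

    heights: array of shape (n_repeats, n_locations), e.g. (25, 5184),
    holding one height value (in µm) per repetition and lateral location.
    Returns the minimum, average and maximum of the per-location
    standard deviations.
    """
    # Assumption: sample standard deviation over the repetitions.
    per_location_std = np.std(heights, axis=0, ddof=1)
    return (float(per_location_std.min()),
            float(per_location_std.mean()),
            float(per_location_std.max()))


# Synthetic stand-in data: a flat mirror at 100 µm with a noise level
# that varies across the field (purely hypothetical values).
rng = np.random.default_rng(0)
noise_level = np.linspace(0.2, 1.0, 5184)
heights = 100.0 + rng.normal(0.0, noise_level, size=(25, 5184))

u_min, u_avg, u_max = uncertainty_summary(heights)
print(f"min {u_min:.2f} µm, avg {u_avg:.2f} µm, max {u_max:.2f} µm")
```

Because each location is evaluated independently, the minimum and maximum values reflect the best and worst lateral positions rather than a global noise level.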


Several insights can be gained through this table. First of all, for the array scan 
method with fixed pitch, the measurement uncertainty generally increases 
as the pitch distance decreases. The influence of the crosstalk can be clearly 
demonstrated from these three different configurations. The level of crosstalk 




Table 5.1: Uncertainty and speed comparison of various methods. The results are for a
broadband mirror as the test target.


Method                                        Uncertainty                         Frames per
                                              min        avg        max           measurement

Array scan with fixed pitch
  Pitch: 20 pixels                            0.16 µm    0.49 µm    1.37 µm       19600
  Pitch: 10 pixels                            0.15 µm    0.53 µm    2.12 µm       4900
  Pitch: 5 pixels                             0.15 µm    0.57 µm    5.69 µm       1225

Array scan with iterative array adaptation
  Iteration 1                                 2.21 µm    4.65 µm    8.89 µm       800
  Iteration 1+2                               13l1um     3.12 µm    6.00 µm       1000

Direct area scan
  Period: 49 pixels                           0.19 µm    0.68 µm    2.39 µm       49


is not uniform across the measurement field even for a mirror because the PSF
varies laterally due to optical aberrations in the system. The minimum
uncertainty values remain almost unchanged, since they represent the best case
in all three configurations, where the crosstalk level is very small. In such
situations, the uncertainty values come from the measurement of a pure confocal
peak signal with minimum crosstalk from its neighbors. In contrast, the
maximum uncertainty values vary dramatically, as they represent the worst-case
scenario for the three configurations, in which the crosstalk level is
directly linked with the pitch distance. Secondly, the direct area scan method
is clearly better than the iterative array adaptation method in terms of
uncertainty and has a performance close to that of the array scan method with a
pitch distance of 10 pixels.


The data listed in Table 5.1 is not sufficient to draw a conclusion regarding
the accuracy of the various methods since it does not take all systematic errors
into consideration. For example, the uncertainty values in the table could lead
to the false impression that the iterative array adaptation method has the worst
performance of all listed methods, which is not true. Figure 5.22 illustrates the
measurement result of the coin using the array scan method with a pitch distance


Figure 5.22: Height map reconstructed from the confocal signal of array scanning with a pitch 
distance of 5 pixels through Gaussian fitting. Errors in measurement due to signal 
crosstalk are easily visible. 


of 5 pixels. The reconstructed height map contains a large amount of
systematic error due to crosstalk, which makes the measurement quality much
inferior to the result of the iterative array adaptation method (Figure 5.17).


Table 5.2: Mean absolute bias of various methods with respect to confocal array scanning with
a pitch distance of 20 pixels. The test target is the coin.


Method                                        Mean absolute bias    Frames per
                                                                    measurement

Array scan with fixed pitch
  Pitch: 20 pixels                            N/A                   19600
  Pitch: 10 pixels                            20.9 µm               4900
  Pitch: 5 pixels                             49.7 µm               1225

Array scan with iterative array adaptation
  Iteration 1                                 15.3 µm               800
  Iteration 1+2                               14.7 µm               1000

Direct area scan
  Period: 49 pixels                           8.9 µm                49


The measurement of array scan with a pitch distance of 20 pixels generates
the best result due to its low uncertainty and crosstalk, and has been used
as a benchmark in previous sections. With this measurement result as the
reference, the mean absolute biases of the other methods and configurations can
be computed, which are listed in Table 5.2. Both candidate methods for the main
measurement generate a lower bias than array scan with a pitch distance
of 10 pixels or less, indicating a smaller amount of crosstalk. In particular,
the direct area scan method yields the result closest to that of the array
scanning method with a pitch of 20 pixels.
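The bias metric used above can be illustrated as follows. This is a minimal sketch that assumes both height maps are sampled on the same lateral grid and that invalid pixels are marked as NaN; the function name `mean_absolute_bias` and the toy values are hypothetical.

```python
import numpy as np


def mean_absolute_bias(height_map, reference_map):
    """Mean absolute deviation of a height map from a reference map.

    Both maps are assumed to share the same lateral grid and unit (µm).
    NaN entries (assumed to mark invalid pixels) are ignored.
    """
    test = np.asarray(height_map, dtype=float)
    ref = np.asarray(reference_map, dtype=float)
    return float(np.nanmean(np.abs(test - ref)))


# Toy example with a 2 x 2 height map in µm.
reference = np.array([[0.0, 10.0], [20.0, 30.0]])
measured = reference + np.array([[1.0, -2.0], [3.0, -4.0]])
bias = mean_absolute_bias(measured, reference)
print(bias)  # 2.5
```

Since the absolute value is taken per pixel, deviations of opposite sign do not cancel, which makes the metric sensitive to localized crosstalk errors.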


Combining both the uncertainty and the bias, it is clear that the direct area
scan method is the better of the two candidate methods for the main
measurement. With the lowest bias and a very low level of uncertainty, the
performance of the direct area scan method is at least as good as that of the
array scan method with a pitch of 10 pixels. With just 49 frames per measurement
for the limited measurement range, it achieves a speed improvement of at least
100 times while maintaining a comparable measurement accuracy.
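The frame counts in Tables 5.1 and 5.2 are consistent with a simple counting model in which a fixed-pitch array scan needs one frame per lateral shift (pitch squared) and per axial step. The model and the constant `AXIAL_STEPS = 49` are inferred from the tabulated numbers and are assumptions, not statements taken from the text.

```python
# Assumption: 49 axial (wavelength) steps per lateral position, inferred
# from the frame counts of the tables rather than stated explicitly.
AXIAL_STEPS = 49


def frames_fixed_pitch(pitch_pixels, axial_steps=AXIAL_STEPS):
    """Frames for a fixed-pitch array scan under the assumed model:
    pitch^2 lateral shifts, each combined with every axial step."""
    return pitch_pixels ** 2 * axial_steps


direct_area_frames = 49  # frames per measurement of the direct area scan

for pitch in (20, 10, 5):
    frames = frames_fixed_pitch(pitch)
    print(f"pitch {pitch:2d}: {frames:5d} frames, "
          f"{frames / direct_area_frames:.0f}x the direct area scan")
```

Under this reading, the factor of 100 with respect to a pitch of 10 pixels equals the number of lateral shifts that the direct area scan avoids.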


The work presented in this dissertation aims to achieve high-speed surface
profilometry based on an adaptive microscope with axial chromatic encoding. 
In this closing chapter, the results of the work are summarized and an outlook 
is presented for future research topics. 


6.1 Conclusion 


Conventional optical surface profilometry technology is dominated by the
dilemma between measurement speed and accuracy. Shape from focus methods
are very efficient in terms of information acquisition but are restricted by a
low accuracy in both the lateral and the axial directions. Despite the high
resolution and accuracy of confocal systems, their measurement suffers from
a slow speed due to its dependence on scanning. Even with state-of-the-art
array scanning methods, a minimum pitch between adjacent measurement
locations must be maintained to avoid crosstalk. To tackle this problem, a
holistic approach has been taken to design and develop an adaptive microscope,
i.e., the AdaScope, together with a cascade measurement strategy.


The AdaScope is composed of two subsystems. To begin with, the pro- 
grammable light source is developed based on a supercontinuum laser. A 
cross disperser pair of a dispersion prism and an echelle grating has been 
constructed to generate the 2D dispersion pattern of the laser spectrum, i.e., 
its echellogram, which is projected onto a DMD. By switching the micro 
mirrors on the DMD individually, arbitrary spectra can be generated and 
collected by the liquid light guide for output to the next stage. Through 
careful alignment and calibration, the programmable light source is able
to generate Gaussian peak spectra with a FWHM smaller than 1 nm. When used
as a scanning source, a step size as accurate as 0.01 nm can be
achieved. The second subsystem is a programmable array microscope based
on a second DMD. Light from the programmable light source is homogenized
and projected onto the second DMD for spatial filtering. Each pixel on the 
second DMD serves as a secondary light source which can be individually 
addressed. A chromatic objective is applied in the microscope to achieve axial 
chromatic encoding. Through combination of the illumination spectra and 
the spatial DMD pattern, any location within the 3D measurement volume 
can be addressed through a localized illumination field. The combination 
of the two subsystems forms a flexible and adaptive microscopic system, 
allowing different measurement principles to be implemented and analyzed. 


Based on the AdaScope platform, a cascade measurement strategy has been 
proposed. Multiple measurement methods can be combined to perform a
complete measurement task, where the coarse but fast measurement result of one
stage is used to initialize a slower but more accurate measurement in the next
stage. By combining the advantages of different methods, the dilemma between
scanning density and measurement accuracy in conventional optical
surface profilometry can be tackled. For the pre-measurement stage, a
compressive SFF method has been developed to generate a coarse measurement
result of the surface profile using a small number of frames. Each frame is
compressively captured as a linear combination of all focal planes within the
measurement volume. Reconstruction of the SFF signal is directly performed in the
focus measure space. Experimental results demonstrate an acquisition speed
7 times faster than conventional SFF while the axial position of the object
can still be estimated correctly. As one candidate method for the
main measurement stage, the iterative array adaptation method is based on 
the conventional confocal array scanning method. Multiple iterations of array 
scanning are performed while the array density and the axial measurement 
range are dynamically adjusted from iteration to iteration. The mean absolute
error with respect to the benchmark measurement performed by the conventional
array scanning method is 14.7 µm. Another candidate method for the main
measurement stage is the direct area scanning method based on a tilted illumi- 
nation field. It has been demonstrated both theoretically and experimentally 
that when an area illumination field is tilted to a specific angle according to
the NA of the system, the confocal signal, which would disappear in a wide- 
field microscope, can be largely preserved, providing a cue for the 3D surface 
profile. Compared to the conventional array scanning method with a pitch 
of 20 pixels, the axial FWHM of the direct area scanning method is doubled, 
indicating a moderately reduced sensitivity. Nevertheless, compared to array
scanning with a pitch of 10 pixels, the measurement is at least 100
times faster while achieving a comparable measurement accuracy. Last but
not least, for post-measurement refinement, localized confocal scanning based
on Bayesian Experimental Design (BED) has been discussed. Conventional Bayesian
Experimental Design is computationally intensive due to the requirement of
nested MCMC sampling when calculating the utility function. Simulation
results have demonstrated that this process can be greatly accelerated through
approximation by an RNN, which achieves a performance between that of
uniform sampling and that of BED-enabled sampling. Compared to the full BED, the
calculation of the utility function is accelerated by a factor of 600 by the
RNN-based BED.


Through the combination of these different methods, the information regarding
the 3D surface profile of the target object can be acquired in a much more
efficient way. Due to the intrinsic adaptability of the AdaScope, prior
information regarding the test target, such as a CAD model, can be easily
incorporated into the measurement process to further accelerate the
measurement. Additionally, the measurement range of the system can be
dynamically adjusted for individual applications. These characteristics grant the
AdaScope the unique capability to swiftly adapt to different inspection tasks,
which matches the needs of a smart factory in the era of Industry 4.0.


6.2 Outlook 


Although impressive results have been reported based on the AdaScope
platform, its performance is still bounded by several limitations. As an outlook,
potential improvements in future research are identified and discussed.


Incorporation of New Measurement Mode


First of all, thanks to the adaptability of the AdaScope system, new
measurement principles can be easily implemented and incorporated. One
particularly interesting area of research is chromatic confocal spectral
interferometry (CCSI). The CCSI technology was first proposed by Papastathopoulos et
al. [Pap06] as a novel method for topography measurements, which combines
the techniques of spectral interferometry and chromatic confocal microscopy. 
It is the first interferometric method that utilizes a confocally filtered and 
chromatically dispersed focus for detection. Unlike white light interferom- 
etry, the depth range of the sensor is decoupled from the NA of the micro- 
scope objective with the chromatically dispersed focus. As an interferometric 
method, the measured signal provides an even higher sensitivity compared 
to the confocal measurement. The beam trap in the current AdaScope setup 
(Figure can be replaced by a switchable reference arm with phase com- 
pensation. This would allow a final stage of interferometric measurement to 
be incorporated into the cascade measurement strategy. 


Better Illumination 


Secondly, the illumination system can be further improved. This is critical 
to the AdaScope system as the quality of the illumination directly affects the 
measurement accuracy. On the one hand, the spectral accuracy of the
illumination generation can be increased through a new calibration procedure.
Currently, the spectral responses are captured for a series of scanning macro
pixels for calibration. Due to the low intensity of the light on a single macro
pixel, the recorded response suffers from a low SNR, mainly due to photon
noise. Kang et al. proposed a novel calibration procedure for a DMD-enabled
programmable optical filter based on the Hadamard transform. Instead of
scanning individual pixels, patterns generated from the Hadamard transform are
projected onto the DMD while the respective spectral responses are recorded.
With a much higher intensity of the reflected light, the spectrum measurement
benefits from a higher SNR. Compared to sequential scanning, spectra
generated through calibration based on the Hadamard transform are more accurate,
particularly when the number of channels (pixels) is higher. Although the
proposed method has only been applied to a 1D DMD chip with a relatively small
number of channels, the principle applies to a 2D DMD as well. To adopt this
calibration procedure in the AdaScope system, modifications to the algorithm
have to be developed to account for the higher computational requirements due
to the larger number of pixels. Apart from the spectral accuracy, the spatial
homogeneity of the light projected onto the spatial DMD is also of great im- 
portance, particularly for methods using complex illumination spectra, such 
as the CSFF method and the iterative array adaptation method. To improve 
the homogeneity of the illumination field, customized microlens arrays with 
a smaller lenslet pitch and a higher number of periods can be applied. 
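The idea behind the Hadamard-based calibration mentioned above can be illustrated with a toy simulation. This sketch does not reproduce the procedure of Kang et al.; it assumes a simplified model with one scalar response per channel, binary 0/1 DMD masks derived from a Sylvester Hadamard matrix, and additive, signal-independent detector noise. All names and values are hypothetical, and the achievable gain depends on the actual noise model.

```python
import numpy as np


def sylvester_hadamard(order):
    """Hadamard matrix with +1/-1 entries; order must be a power of two."""
    h = np.array([[1.0]])
    while h.shape[0] < order:
        h = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), h)
    return h


def rmse(estimate, truth):
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


n = 64                                     # number of channels (macro pixels)
rng = np.random.default_rng(1)
true_response = rng.uniform(0.5, 1.5, n)   # hypothetical channel responses
sigma = 0.05                               # assumed additive noise per frame

# Sequential scanning: a single channel is switched on in each frame.
scan_estimate = true_response + rng.normal(0.0, sigma, n)

# Hadamard multiplexing: roughly half of the channels are on in each frame.
patterns = (sylvester_hadamard(n) + 1.0) / 2.0   # binary 0/1 masks
frames = patterns @ true_response + rng.normal(0.0, sigma, n)
hadamard_estimate = np.linalg.solve(patterns, frames)

print("sequential RMSE:", rmse(scan_estimate, true_response))
print("Hadamard RMSE:  ", rmse(hadamard_estimate, true_response))
```

Both schemes use the same number of frames; the multiplexed frames simply collect more light per exposure, which is what reduces the influence of the additive noise in this model.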


The superior adaptability of the AdaScope system originates from its ability 
to generate different 3D illumination fields. Nevertheless, time-multiplexing 
techniques are highly involved in the current system architecture. Although 
not intrinsically required by any methods proposed in this dissertation, the 
time-multiplexing procedure directly limits the practical performance of the
AdaScope. The homogenized light from the programmable light source is
projected onto the complete spatial DMD, where switchable pixels serve as
secondary light sources. Under such circumstances, the illumination spectra of
different lateral positions cannot be adjusted simultaneously. As discussed in
Section the control of the illumination is split to the control of the tem- 
poral illumination spectrum and the control of the temporal spatial DMD pat- 
tern. For the measurement methods proposed in this dissertation, one of the 
aforementioned components is fixed empirically while the other component 
is derived from the desired complete illumination field. For example, in the 
direct area scanning method, a series of scanning wavelength peaks is fixed 
before the corresponding spatial DMD patterns are derived. Although effi- 
cient enough for most methods where a regular (e.g., symmetric, periodic, 
global, etc.) illumination field is required, this empirical procedure becomes 
very inefficient when the required illumination field has a more complex spa- 
tial distribution. Thus more rigorous algorithms should be developed in the 
future for the decomposition of the illumination matrix/tensor, with the target 
of minimizing the total time of illumination. Toward a more distant future, 
hyperspectral displays could be built based on technologies such as quantum
dots, enabling simultaneous adjustment of the illumination spectra for
different lateral locations.
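To make the decomposition problem concrete, the following sketch shows a naive baseline that is not taken from this work: the desired illumination field is stored as a matrix with one spectrum per DMD pixel, and one time slot is spent on each distinct spectrum. The function name `decompose_by_unique_spectra` and the toy field are hypothetical. A more rigorous algorithm, as called for above, would aim at fewer or shorter time slots.

```python
import numpy as np


def decompose_by_unique_spectra(field):
    """Split a desired illumination field into time slots.

    field: array of shape (n_pixels, n_wavelengths); row p is the spectrum
    requested at DMD pixel p. Each time slot pairs one spectrum of the
    programmable light source with a binary spatial DMD mask.
    Returns (spectra, masks) with shapes (n_slots, n_wavelengths) and
    (n_slots, n_pixels).
    """
    spectra, inverse = np.unique(field, axis=0, return_inverse=True)
    inverse = np.ravel(inverse)  # flat indices regardless of NumPy version
    masks = inverse[None, :] == np.arange(len(spectra))[:, None]
    return spectra, masks


# Toy field: five pixels, three wavelength channels, three distinct spectra.
field = np.array([[1.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])

spectra, masks = decompose_by_unique_spectra(field)
rebuilt = masks.T.astype(float) @ spectra  # sum over all time slots
print(len(spectra), "time slots")
```

The number of time slots of this baseline grows with the number of distinct spectra, which illustrates why fields with a complex spatial distribution become inefficient under time-multiplexing.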


Implementation of Localized Scan


In Chapter 5, only the pre-measurement stage and the main measurement
stage of the cascade measurement strategy are evaluated through practical 
experiments. For post-measurement refinement, the idea of dynamic local- 
ized scan based on Bayesian Experimental Design has been discussed in Sec- 
tion k4 Nevertheless, this method is only analyzed through simulation for 
two reasons. On the one hand, as discussed previously, the current illumination
generation scheme relies heavily on time-multiplexing, which makes the
localized scan very inefficient. To have a meaningful implementation of any
localized scanning method, a dramatically different hardware configuration has
to be designed and developed, allowing simultaneous spectral control of
individual lateral locations. On the other hand, despite the great speed
improvement of RNN-based BED compared to full BED based on MC sampling
demonstrated by the simulation, the inference of the utility function over a
relatively small number of designs for a single lateral location still takes
0.028 s. For the full HD resolution of the spatial DMD, the inference would take
more than one hour to complete. This problem can be partly mitigated by
next-generation GPU technology, such as the RTX 2080 by Nvidia. More importantly,
new neural network architectures remain to be investigated to improve the
performance of the proposed method. For example, parallelization over
multiple lateral locations has high potential considering the increasing memory
of graphics cards. Meanwhile, information from adjacent locations can also be
incorporated to increase the accuracy of the inference.


A.1 Camera Calibration


The target of the camera calibration process is to obtain the correspondence
between the spatial DMD coordinate system and the camera coordinate system,
so that the dynamic control of the DMD can be derived based on camera
images. The process is based on a pinhole camera model as presented in the work
by Zhang [Zha00] and is implemented using the OpenCV library in Python.


Firstly, a single wavelength illumination is projected onto the spatial DMD 
and a broadband mirror perpendicular to the optical axis is positioned axially 
so that the image of the DMD pattern is sharply focused on the camera image. 
Secondly, the spatial DMD displays a point array pattern with a pitch distance 
of 20 pixels. Lastly, the image of the point array is captured by the camera. 


The coordinates of the point array in the camera image are retrieved by
calculating the intensity centroid position inside a region of interest with a width
of 15 camera pixels centered around the brightest pixel for each point. With
the list of DMD coordinates and the list of camera coordinates for the point
array, the rotation vector, the translation vector, the camera matrix as well
as the distortion coefficients are calculated using existing methods from the
OpenCV library.
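The centroid extraction described above can be sketched for a single point as follows. This is an illustrative implementation in NumPy, not the calibration code of this work; the function name `spot_centroid` and the synthetic Gaussian spot are assumptions. For a full point array, the same step would be applied once per point, and the resulting coordinate lists would then be passed to an OpenCV calibration routine (for example `cv2.calibrateCamera`, whose outputs match the quantities listed above, although the exact function used is not named in the text).

```python
import numpy as np


def spot_centroid(image, roi_width=15):
    """Intensity centroid (row, col) of the brightest spot in an image.

    The centroid is evaluated inside a square region of interest of
    roi_width pixels centered on the brightest pixel.
    """
    image = np.asarray(image, dtype=float)
    half = roi_width // 2
    r0, c0 = np.unravel_index(np.argmax(image), image.shape)
    # Clip the region of interest at the image borders.
    r_lo, r_hi = max(r0 - half, 0), min(r0 + half + 1, image.shape[0])
    c_lo, c_hi = max(c0 - half, 0), min(c0 + half + 1, image.shape[1])
    roi = image[r_lo:r_hi, c_lo:c_hi]
    rows, cols = np.mgrid[r_lo:r_hi, c_lo:c_hi]
    total = roi.sum()
    return float((rows * roi).sum() / total), float((cols * roi).sum() / total)


# Synthetic test image: one Gaussian spot at a sub-pixel position.
rr, cc = np.mgrid[0:64, 0:64]
true_row, true_col = 20.3, 31.6
image = np.exp(-((rr - true_row) ** 2 + (cc - true_col) ** 2) / (2 * 2.0 ** 2))

row, col = spot_centroid(image)
print(f"estimated centroid: ({row:.2f}, {col:.2f})")
```

Restricting the sum to a small region of interest keeps neighboring points of the array and background light from shifting the centroid.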


Based on the calibration result, the DMD pixels can be projected onto the
camera plane with their corresponding intensities interpolated from the camera
image.


A Appendix 


A.2 Bernstein Polynomials 


For the measurement method based on linearly weighted filters, a measure- 
ment matrix constructed with the Bernstein polynomials has the property of 
maintaining the normalized centroid position of the underlying signal. The 
proof is given in this section. 


The n + 1 Bernstein basis polynomials of degree n are defined as

$$ b_{\nu,n}(t) = \binom{n}{\nu} t^{\nu} (1 - t)^{n-\nu}, \qquad \nu = 0, \dots, n, \quad t \in [0, 1]. \tag{A.1} $$


The signal is represented by a vector $x = (x_1, \dots, x_m)^T$ and the measurement
is represented by $y = (y_0, \dots, y_n)^T$ with n + 1 channels. Supposing both the
signal and the measurement are spanned on a support of [0,1], the normalized
centroids of the signal and the measurement can be expressed as


If each channel of the measurement is based on the respective Bernstein
polynomial, the result can be expressed as a linear combination of the signal
components

$$ y_\nu = \sum_{i=1}^{m} b_{\nu,n}\!\left(\frac{i-1}{m-1}\right) x_i. \tag{A.3} $$
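The linear combination in Equation A.3 amounts to a matrix-vector product, which allows the centroid-preserving property to be checked numerically. The sketch below assumes that the signal has m samples at the positions (i - 1)/(m - 1) and that channel ν of the measurement is located at ν/n; the function names and the random test signal are hypothetical.

```python
from math import comb

import numpy as np


def bernstein_matrix(n, m):
    """Measurement matrix of shape (n + 1, m) with entries b_{v,n}(t_i),
    where t_i = (i - 1) / (m - 1) for i = 1, ..., m (assumed sampling)."""
    t = np.linspace(0.0, 1.0, m)[None, :]
    v = np.arange(n + 1)[:, None]
    binom = np.array([comb(n, k) for k in range(n + 1)], dtype=float)[:, None]
    return binom * t ** v * (1.0 - t) ** (n - v)


def normalized_centroid(values):
    """Centroid of a non-negative vector whose entries are spread
    uniformly over the support [0, 1]."""
    positions = np.linspace(0.0, 1.0, len(values))
    return float(np.dot(positions, values) / np.sum(values))


rng = np.random.default_rng(0)
x = rng.random(50)                 # arbitrary non-negative test signal
B = bernstein_matrix(n=4, m=50)    # five measurement channels
y = B @ x

print(normalized_centroid(x), normalized_centroid(y))
```

The two printed values agree up to floating-point rounding, independent of the chosen signal and of the degree n (for n of at least 1).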


Therefore, the centroid of the measurement can be rewritten as 


All Bernstein polynomials of a certain degree form a partition of unity:

$$ \sum_{\nu=0}^{n} b_{\nu,n}(t) = \bigl(t + (1 - t)\bigr)^{n} = 1. \tag{A.5} $$


Meanwhile, it can be shown that 


Based on the equations above, it can be proven that the normalized
centroid of the measurement equals that of the signal:
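The argument of this section can be written out as the following sketch. It rests on assumptions that are not confirmed by the text: the signal has m samples located at $t_i = (i-1)/(m-1)$, channel ν of the measurement is located at ν/n, and the symbols $c_x$ and $c_y$ for the two normalized centroids are introduced here for illustration only.

```latex
% Normalized centroids of the signal and of the measurement
% (assumed positions: t_i = (i-1)/(m-1) and \nu / n).
\begin{align*}
c_x &= \frac{\sum_{i=1}^{m} t_i \, x_i}{\sum_{i=1}^{m} x_i},
\qquad
c_y = \frac{\sum_{\nu=0}^{n} \frac{\nu}{n} \, y_\nu}{\sum_{\nu=0}^{n} y_\nu},
\qquad
t_i = \frac{i-1}{m-1}.
\\[1ex]
% Inserting y_\nu = \sum_i b_{\nu,n}(t_i) x_i and exchanging the sums:
c_y &= \frac{\sum_{i=1}^{m} x_i \sum_{\nu=0}^{n} \frac{\nu}{n} \, b_{\nu,n}(t_i)}
            {\sum_{i=1}^{m} x_i \sum_{\nu=0}^{n} b_{\nu,n}(t_i)}.
\\[1ex]
% First moment of the Bernstein basis, using
% (\nu / n) \binom{n}{\nu} = \binom{n-1}{\nu-1} for n >= 1:
\sum_{\nu=0}^{n} \frac{\nu}{n} \, b_{\nu,n}(t)
  &= t \sum_{\nu=1}^{n} \binom{n-1}{\nu-1} t^{\nu-1} (1-t)^{n-\nu} = t.
\\[1ex]
% With the partition of unity in the denominator:
c_y &= \frac{\sum_{i=1}^{m} t_i \, x_i}{\sum_{i=1}^{m} x_i} = c_x.
\end{align*}
```

The last line follows because the inner sum in the numerator reduces to $t_i$ and the inner sum in the denominator reduces to 1 for every sample position.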